Bridging Natural Language and Interactive What-If Interfaces via LLM-Generated Declarative Specification

2026-04-08

Artificial Intelligence, Human-Computer Interaction
AI summary

The authors study how to make 'what-if' analyses easier by turning natural language questions into interactive visual tools. They introduce a two-stage workflow: an LLM first converts each question into a declarative specification written in the Praxa Specification Language (PSL), which can be validated and repaired, and the specification is then compiled into an interactive visual interface. In their benchmark, about half of the initial specifications (52.42%) were correct, and targeted repairs raised the success rate to 80.42%. The work highlights the role of the intermediate specification in catching errors so that the interface truly matches the user's intent.

What-if analysis, Natural language processing, Large language models, Praxa Specification Language, Interactive visualization, Error taxonomy, Specification validation, Few-shot learning
Authors
Sneha Gathani, Sirui Zeng, Diya Patel, Ryan Rossi, Dan Marshall, Cagatay Demiralp, Steven Drucker, Zhicheng Liu
Abstract
What-if analysis (WIA) is an iterative, multi-step process where users explore and compare hypothetical scenarios by adjusting parameters, applying constraints, and scoping data through interactive interfaces. Current tools fall short of supporting effective interactive WIA: spreadsheet and BI tools require time-consuming and laborious setup, while LLM-based chatbot interfaces are semantically fragile, frequently misinterpret intent, and produce inconsistent results as conversations progress. To address these limitations, we present a two-stage workflow that translates natural language (NL) WIA questions into interactive visual interfaces via an intermediate representation, powered by the Praxa Specification Language (PSL): first, LLMs generate PSL specifications from NL questions capturing analytical intent and logic, enabling validation and repair of erroneous specifications; and second, the specifications are compiled into interactive visual interfaces with parameter controls and linked visualizations. We benchmark this workflow with 405 WIA questions spanning 11 WIA types, 5 datasets, and 3 state-of-the-art LLMs. The results show that, across models, about half of the specifications (52.42%) are generated correctly without intervention. We analyze the failure cases and derive an error taxonomy spanning non-functional errors (specifications fail to compile) and functional errors (specifications compile but misrepresent intent). Based on the taxonomy, we apply targeted repairs to the failure cases using few-shot prompts and improve the success rate to 80.42%. Finally, we show how undetected functional errors propagate through compilation into plausible but misleading interfaces, demonstrating that the intermediate specification is critical for reliably bridging NL and interactive WIA interfaces in LLM-powered WIA systems.
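
The two-stage workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: PSL's actual syntax is not shown here, so a plain dictionary with three assumed keys stands in for a specification, and `validate`, `generate_with_repair`, `compile_interface`, `fake_generate`, and `fake_repair` are all hypothetical names. The two `fake_*` functions are stubs standing in for LLM calls.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a PSL specification: the real PSL syntax is not
# given in the abstract, so a dict with three assumed keys is used instead.
REQUIRED_KEYS = {"dataset", "parameters", "views"}


@dataclass
class Result:
    spec: dict
    errors: list = field(default_factory=list)
    repaired: bool = False


def validate(spec: dict) -> list:
    """Return non-functional errors, i.e. problems that would block compilation."""
    errors = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - spec.keys())]
    for param in spec.get("parameters", []):
        if param.get("min", 0) > param.get("max", 0):
            errors.append(f"empty range for parameter: {param.get('name')}")
    return errors


def generate_with_repair(question, generate, repair, max_repairs=1):
    """Stage 1: generate a spec, then repair it while validation errors remain."""
    spec = generate(question)
    errors = validate(spec)
    attempts = 0
    while errors and attempts < max_repairs:
        spec = repair(question, spec, errors)
        errors = validate(spec)
        attempts += 1
    return Result(spec=spec, errors=errors, repaired=attempts > 0)


def compile_interface(spec: dict) -> list:
    """Stage 2: turn a valid spec into a list of UI widget descriptions."""
    widgets = [f"slider:{p['name']}[{p['min']}..{p['max']}]" for p in spec["parameters"]]
    widgets += [f"chart:{view}" for view in spec["views"]]
    return widgets


def fake_generate(question):
    # Stub for the LLM generation call; omits "views" on purpose to trigger a repair.
    return {"dataset": "sales", "parameters": [{"name": "discount", "min": 0, "max": 50}]}


def fake_repair(question, spec, errors):
    # Stub for a few-shot repair prompt; fills in the missing field.
    return {**spec, "views": ["revenue_by_region"]}


result = generate_with_repair("What if the discount rises to 20%?", fake_generate, fake_repair)
widgets = compile_interface(result.spec) if not result.errors else []
print(widgets)
```

Note that a structural check like `validate` can only catch non-functional errors. A specification that is well-formed but encodes the wrong parameter or scope would pass through to `compile_interface` unchanged, which is the kind of functional error the abstract warns can yield plausible but misleading interfaces.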