Simulate, Reason, Decide: Scientific Reasoning with LLMs for Simulation-Driven Decision Making

2026-06-03 · Artificial Intelligence

AI summary

The authors present MechSim, a framework that helps large language models (LLMs) reason about and explain how scientific simulators work instead of treating them as black boxes. MechSim represents each simulator through a structured schema of its assumptions, variables, mechanism dependencies, and execution traces, so the LLM can link simulator outcomes to the mechanisms that produced them. This improves transparency and auditability and makes simulation-based decisions easier to justify. In evaluations across multiple high-stakes domains, the authors report that MechSim improves mechanism-level explanation quality, simulator analysis, and the reliability of downstream decisions.

Large Language Models, Scientific Simulators, Neuro-symbolic Reasoning, Mechanism Representation, Simulation Assumptions, Executable Simulators, Structured Schema, Decision-making, Transparency, Auditability
Authors
Yuhan Yang, Ruipu Li, Alexander Rodríguez
Abstract
Scientific simulators are increasingly being integrated into LLM-driven systems for high-stakes simulation-driven decision-making. However, existing frameworks primarily use LLMs to generate, calibrate, or execute simulators, treating them as black-box interfaces rather than as structured mechanistic systems that can be reasoned about. As a result, current approaches lack the ability to identify, represent, and reason about the assumptions and mechanisms underlying simulator behavior, limiting transparency, auditability, and decision justification. We introduce MechSim, a mechanism-grounded neuro-symbolic reasoning framework for executable scientific simulators. Unlike prior neuro-symbolic approaches that primarily reason over static symbolic structures, MechSim enables LLM agents to reason about the mechanisms, assumptions, and execution behavior of scientific simulators. Our framework represents simulators through a shared structured schema capturing assumptions, variables, mechanism dependencies, and execution traces. On top of this representation, LLM agents operate as constrained reasoning engines that generate structured, evidence-grounded explanations linking simulator outcomes to their underlying mechanisms. We evaluate our approach across multiple high-stakes domains and show that it improves mechanism-level explanation quality, simulator analysis, and downstream decision-making reliability.
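
To make the idea of a structured simulator schema concrete, here is a minimal Python sketch of what such a representation might look like. The abstract does not give the actual MechSim schema, so everything in it is a hypothetical illustration: the class names `Mechanism` and `SimulatorSchema`, the `explain` method, and the toy SIR-style epidemic example. The sketch only shows how assumptions, variables, mechanism dependencies, and an execution trace could be stored together so that an outcome can be traced back to the mechanisms and assumptions behind it.

```python
from dataclasses import dataclass, field


@dataclass
class Mechanism:
    """Hypothetical record for one simulator mechanism (not the paper's schema)."""
    name: str
    inputs: list[str]       # variables the mechanism reads
    outputs: list[str]      # variables the mechanism writes
    assumptions: list[str]  # ids of the assumptions it relies on


@dataclass
class SimulatorSchema:
    """Hypothetical container for assumptions, variables, mechanisms, and a trace."""
    assumptions: dict[str, str]   # assumption id -> plain-language statement
    variables: list[str]
    mechanisms: list[Mechanism]
    trace: list[dict[str, float]] = field(default_factory=list)  # one snapshot per step

    def explain(self, variable: str) -> dict:
        """Collect the mechanisms, assumptions, and trace values tied to one variable."""
        producers = [m for m in self.mechanisms if variable in m.outputs]
        assumption_ids = sorted({a for m in producers for a in m.assumptions})
        return {
            "variable": variable,
            "mechanisms": [m.name for m in producers],
            "depends_on": sorted({v for m in producers for v in m.inputs}),
            "assumptions": {a: self.assumptions[a] for a in assumption_ids},
            "evidence": [step[variable] for step in self.trace if variable in step],
        }


# Toy SIR-style example with made-up numbers, used purely for illustration.
schema = SimulatorSchema(
    assumptions={"A1": "homogeneous mixing", "A2": "constant recovery rate"},
    variables=["S", "I", "R"],
    mechanisms=[
        Mechanism("infection", inputs=["S", "I"], outputs=["S", "I"], assumptions=["A1"]),
        Mechanism("recovery", inputs=["I"], outputs=["I", "R"], assumptions=["A2"]),
    ],
    trace=[{"S": 990.0, "I": 10.0, "R": 0.0}, {"S": 980.0, "I": 18.0, "R": 2.0}],
)

report = schema.explain("R")
print(report)
```

In this sketch, a structured record such as `report` is the kind of evidence an LLM agent could be constrained to cite when explaining a simulator outcome, rather than reasoning over free-form text alone.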