Active Inference as the Test-Time Scaling Law for Physical AI Agents

2026-06-22 • Artificial Intelligence

AI summary

The authors introduce a test-time scaling law for physical AI agents that lets them adapt to unforeseen situations during deployment, not just during training. The law is based on active inference, a theory under which agents act to reduce prediction errors by updating their behavior in real time. This update resembles the way the brain revises decisions in light of new information, which helps the agent handle changes in its environment. The authors also develop a practical method for applying the idea and show, in simulations of an autonomous driving task, that it outperforms model-free Q-learning and model-based Bayesian reinforcement learning.

Keywords: Physical artificial intelligence, Active inference, Scaling law, Policy update, Bayesian inference, Variational inference, Free energy principle, Generalization, Reinforcement learning, Non-stationary environments

Authors

Omar Hashash, Christo Kurisummoottil Thomas, Walid Saad, Merouane Debbah, Karl Friston, Adeel Razi

Abstract

In this paper, a novel test-time scaling law for physical artificial intelligence (AI) agents is introduced. This scaling law enables physical AI agents to reason with their world models to generalize in unforeseen scenarios at test time. The derived scaling law is grounded in the first principle of active inference, which equips agents with the general objective to survive in the real world, under which their specific task objectives are subsumed. Active inference achieves this by providing the reasoning to resolve prediction errors that arise when the agent encounters unforeseen situations outside its training distribution, enabling generalization in non-stationary environments. The proposed scaling law captures this by dynamically updating the agent's policy with this reasoning at test time. This policy update is modeled as a soft Bayesian inference process in which beliefs about the policy are updated using the reasoning that reduces expected prediction errors under allowable policies as a likelihood. The resulting posterior policy admits a biological interpretation, recovering the scaling mechanism that engages the brain's basal ganglia and prefrontal cortex at test time. To solve this analytically intractable problem, a variational inference solution minimizing free energy bounds is developed. This solution extends to enable learning beyond training by reinforcing new instances, resolved at test time, in both the policy and world model. Unlike existing scaling laws constrained by model size and training data, the derived solution scales with the continuous real-world experience of a physical AI agent. Simulation results on an autonomous driving task demonstrate that the proposed solution outperforms model-free Q-learning and model-based Bayesian reinforcement learning, achieving robust generalization to unforeseen scenarios while improving inference efficiency by over 36%.
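
The soft Bayesian policy update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's equations, so the form used here is an assumption borrowed from the standard active inference literature: a prior over a discrete set of policies is reweighted by a likelihood proportional to exp(-precision × expected prediction error). The function name `posterior_policy`, the `precision` parameter, and the numbers are hypothetical and chosen only for illustration.

```python
import math


def posterior_policy(prior, expected_error, precision=1.0):
    """Illustrative soft Bayesian update over a discrete policy set.

    Assumed form (not taken from the paper):
        posterior(pi) ∝ prior(pi) * exp(-precision * expected_error(pi))
    """
    # Work in log space; policies with zero prior mass stay at zero.
    log_w = [
        (math.log(p) if p > 0 else float("-inf")) - precision * g
        for p, g in zip(prior, expected_error)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]


# Hypothetical example: three candidate policies.
prior = [0.5, 0.3, 0.2]            # beliefs about policies before the update
expected_error = [2.0, 0.5, 1.0]   # assumed expected prediction error per policy
posterior = posterior_policy(prior, expected_error)
best = max(range(len(posterior)), key=posterior.__getitem__)
```

In this sketch, a policy with a lower expected prediction error gains probability mass relative to its prior, and setting `precision=0` leaves the prior unchanged. The variational solution that the abstract mentions for the intractable case is not reproduced here.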
