ATLAS: Active Theory Learning for Automated Science

2026-06-10 | Machine Learning

Machine Learning, Artificial Intelligence
AI summary

The authors created a system called ATLAS that helps scientists build models of behavior by automatically designing informative experiments. ATLAS generates hypotheses about behavior as an ensemble of sparse neural networks and then chooses new experiments that best tell these hypotheses apart. They tested ATLAS in simulation on bandit tasks, where agents learn from rewards, and compared it with random and expert-designed experiments. ATLAS was 5-10x more sample-efficient than random experimentation, pointing to a faster way to discover behavioral models.

active learning, mechanistic modeling, reinforcement learning, bandit tasks, sparse neural networks, Disentangled RNNs, sample efficiency, behavioral modeling, experiment design
Authors
Noémi Éltető, Nathaniel D. Daw, Kimberly L. Stachenfeld, Kevin J. Miller
Abstract
Advancing scientific understanding through mechanistic modeling requires posing the right experimental questions to yield maximally informative data. To automate this pursuit within cognitive science, we introduce ATLAS (Active Theory Learning for Automated Science), an active learning framework for the data-driven discovery of interpretable behavioral models. ATLAS iterates between generating mechanistic hypotheses, instantiated as a diverse ensemble of sparse neural networks (Disentangled RNNs), and designing experiments that optimally distinguish between them. We test this approach on the problem of recovering reinforcement learning agents from their behavior in bandit tasks. ATLAS designs varied sequences of qualitatively novel experiments with temporal structure tailored to underlying agent characteristics. The models trained on these experiments are evaluated against a comprehensive set of metrics for mechanistic modeling that capture behavioral, structural, and computational similarity. ATLAS achieves a 5-10x improvement in sample efficiency across all metrics compared to random experimentation, and its performance is further validated against expert-designed experiments derived from the literature. These in silico results showcase ATLAS's potential to accelerate human-interpretable insights in cognitive science and other domains where scientific inquiry relies on discovering mechanistic models.
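The hypothesize-then-design loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a small grid of softmax Q-learning agents stands in for the Disentangled RNN ensemble, a handful of hand-written two-armed bandit reward schedules stand in for the space of experiments, and the weighted variance of predicted choice probabilities stands in for the experiment-selection criterion. All function names, parameters, and schedules are hypothetical.

```python
import math
import random

def p_arm1(q, beta):
    """Softmax probability of choosing arm 1 given two action values."""
    return 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))

def simulate(alpha, beta, schedule, rng):
    """Run a Q-learning agent on a schedule; return its choices and P(arm 1) per trial."""
    q, choices, probs = [0.0, 0.0], [], []
    for rewards in schedule:                 # rewards = (payoff of arm 0, payoff of arm 1)
        p = p_arm1(q, beta)
        c = 1 if rng.random() < p else 0
        q[c] += alpha * (rewards[c] - q[c])  # delta-rule update of the chosen arm only
        choices.append(c)
        probs.append(p)
    return choices, probs

def log_likelihood(alpha, beta, schedule, choices):
    """Log-probability of the observed choices under one hypothesized agent."""
    q, ll = [0.0, 0.0], 0.0
    for rewards, c in zip(schedule, choices):
        p = p_arm1(q, beta)
        ll += math.log(p if c == 1 else 1.0 - p)
        q[c] += alpha * (rewards[c] - q[c])
    return ll

def normalize(log_w):
    """Turn accumulated log-likelihoods into weights that sum to 1."""
    m = max(log_w)
    w = [math.exp(x - m) for x in log_w]
    return [x / sum(w) for x in w]

def disagreement(hypotheses, weights, schedule, seed=0):
    """Assumed design score: weighted variance across hypotheses of P(arm 1), averaged over trials."""
    trajs = [simulate(a, b, schedule, random.Random(seed))[1] for a, b in hypotheses]
    total = 0.0
    for t in range(len(schedule)):
        mean = sum(w * tr[t] for w, tr in zip(weights, trajs))
        total += sum(w * (tr[t] - mean) ** 2 for w, tr in zip(weights, trajs))
    return total / len(schedule)

def run_active_learning(true_agent, hypotheses, designs, rounds=5, seed=0):
    """Alternate between picking the most discriminating design and reweighting hypotheses."""
    rng = random.Random(seed)
    log_w = [0.0] * len(hypotheses)
    history = []
    for _ in range(rounds):
        weights = normalize(log_w)
        name = max(designs, key=lambda n: disagreement(hypotheses, weights, designs[n]))
        choices, _ = simulate(*true_agent, designs[name], rng)  # "run" the experiment
        for i, (a, b) in enumerate(hypotheses):
            log_w[i] += log_likelihood(a, b, designs[name], choices)
        history.append(name)
    return normalize(log_w), history

# Hypothetical hypothesis grid: (learning rate alpha, inverse temperature beta).
hypotheses = [(a, b) for a in (0.1, 0.5, 0.9) for b in (1.0, 5.0)]
# Hypothetical candidate experiments: deterministic per-trial payoffs for the two arms.
designs = {
    "no_reward": [(0, 0)] * 20,
    "stable": [(0, 1)] * 20,
    "reversal": [(0, 1)] * 10 + [(1, 0)] * 10,
    "alternating": [(0, 1), (1, 0)] * 10,
}
posterior, history = run_active_learning((0.5, 5.0), hypotheses, designs)
print(history)
print([round(w, 3) for w in posterior])
```

In this sketch a schedule that never pays out scores essentially zero, because every hypothesized agent predicts the same 50/50 choices on it, so the loop favors schedules on which the hypotheses diverge. ATLAS as described generates its experiments and hypotheses rather than choosing from fixed lists, so the sketch conveys only the overall structure of the iteration.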