AI Agents Can Already Autonomously Perform Experimental High Energy Physics

2026-03-20 • Artificial Intelligence

Artificial Intelligence, Machine Learning

AI summary

The authors show that AI tools based on large language models can independently perform many steps of a high energy physics analysis using existing data and literature. They created a framework called Just Furnish Context (JFC) that combines autonomous agents with knowledge retrieval and peer review to carry out credible physics studies. They tested this on real data sets to measure important physics phenomena. The authors suggest these tools can reduce repetitive coding work, allowing physicists to focus on new ideas and validation. They also recommend that the physics community rethink how students are trained and how analyses are organized given these advances.

Large language models, High energy physics (HEP), Event selection, Background estimation, Uncertainty quantification, Statistical inference, ALEPH, DELPHI, CMS, Autonomous analysis agents

Authors

Eric A. Moreno, Samuel Bright-Thonney, Andrzej Novak, Dolores Garcia, Philip Harris

Abstract

Large language model-based AI agents are now able to autonomously execute substantial portions of a high energy physics (HEP) analysis pipeline with minimal expert-curated input. Given access to a HEP dataset, an execution framework, and a corpus of prior experimental literature, we find that Claude Code succeeds in automating all stages of a typical analysis: event selection, background estimation, uncertainty quantification, statistical inference, and paper drafting. We argue that the experimental HEP community is underestimating the current capabilities of these systems, and that most proposed agentic workflows are too narrowly scoped or scaffolded to specific analysis structures. We present a proof-of-concept framework, Just Furnish Context (JFC), that integrates autonomous analysis agents with literature-based knowledge retrieval and multi-agent review, and show that this is sufficient to plan, execute, and document a credible high energy physics analysis. We demonstrate this by conducting analyses on open data from ALEPH, DELPHI, and CMS to perform electroweak, QCD, and Higgs boson measurements. Rather than replacing physicists, these tools promise to offload the repetitive technical burden of analysis code development, freeing researchers to focus on physics insight, truly novel method development, and rigorous validation. Given these developments, we advocate for new strategies for how the community trains students, organizes analysis efforts, and allocates human expertise.
