A recipe for scalable attention-based MLIPs: unlocking long-range accuracy with all-to-all node attention
2026-03-06 • Machine Learning
Machine Learning; Computational Engineering, Finance, and Science
AI summary
The authors developed AllScAIP, a machine-learning model that predicts atomic interactions and can handle very large datasets. Unlike previous models that rely on fixed physics rules for long-range forces, their model learns these interactions directly using attention mechanisms between all atoms. They found that for smaller datasets, built-in physics helps, but for bigger datasets and models, their data-driven approach works better. Their model performs well on molecules and materials and allows long, stable simulations that match real experimental results.
machine-learning interatomic potentials, long-range interactions, attention mechanism, energy conservation, molecular dynamics, sample efficiency, inductive biases, biomolecules, catalysts, force accuracy
Authors
Eric Qu, Brandon M. Wood, Aditi S. Krishnapriyan, Zachary W. Ulissi
Abstract
Machine-learning interatomic potentials (MLIPs) have advanced rapidly, with many top models relying on strong physics-based inductive biases. However, as models scale to larger systems like biomolecules and electrolytes, they struggle to accurately capture long-range (LR) interactions, leading current approaches to rely on explicit physics-based terms or components. In this work, we propose AllScAIP, a straightforward, attention-based, and energy-conserving MLIP model that scales to O(100 million) training samples. It addresses the long-range challenge using an all-to-all node attention component that is data-driven. Extensive ablations reveal that in low-data/small-model regimes, inductive biases improve sample efficiency. However, as data and model size scale, these benefits diminish or even reverse, while all-to-all attention remains critical for capturing LR interactions. Our model achieves state-of-the-art energy/force accuracy on molecular systems, as well as a number of physics-based evaluations (OMol25), while being competitive on materials (OMat24) and catalysts (OC20). Furthermore, it enables stable, long-timescale MD simulations that accurately recover experimental observables, including density and heat of vaporization predictions.
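To make the idea of all-to-all node attention concrete, here is a minimal NumPy sketch of generic single-head self-attention in which every atom attends to every other atom, with no distance cutoff or neighbor list. It is not the AllScAIP architecture: the function name, the projection matrices `w_q`, `w_k`, `w_v`, and the feature sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np

def all_to_all_node_attention(h, w_q, w_k, w_v):
    """Generic single-head self-attention over all atoms (illustrative only).

    h: (n_atoms, d) node features; w_q, w_k, w_v: (d, d) projections.
    No cutoff is applied, so every atom can exchange information with
    every other atom, at O(n_atoms^2) cost.
    """
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])          # (n_atoms, n_atoms)
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ v, weights

# Hypothetical toy system: 5 atoms with 8-dimensional features.
rng = np.random.default_rng(0)
n_atoms, d = 5, 8
h = rng.normal(size=(n_atoms, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
out, weights = all_to_all_node_attention(h, w_q, w_k, w_v)
```

Each row of `weights` is a probability distribution over all atoms in the system, which is what lets distant atoms influence one another without an explicit physics-based long-range term.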