Data-driven discovery of governing differential equations across physical systems

2026-06-08 • Machine Learning

Machine Learning, Symbolic Computation

AI summary

The authors explain that differential equations are central to describing how physical systems behave. They review methods for discovering these equations directly from experimental or simulated data, especially when the underlying physics is unclear. They organize discovery problems in a two-dimensional phase diagram defined by structural complexity and coefficient complexity, which clarifies why different families of methods succeed or fail in different settings. They also present the representation-evaluation-optimization (REO) framework, an abstraction of the core problems shared across discovery algorithms. Finally, they argue that the next challenge is to use discovered equations to revise existing theories, distil mechanisms and form new scientific concepts, not merely to recover equations.

differential equations, data-driven discovery, governing laws, structural complexity, coefficient complexity, representation-evaluation-optimization (REO), sparse equations, parameterization, scientific theories, algorithmic frameworks

Authors

Siyu Lou, Hao Xu, Wenguan Wang, Lu Lu, Hao Sun, Yang Liu, Linfeng Zhang, Dongxiao Zhang, Yuntian Chen

Abstract

Differential equations play a critical role in scientific discovery because they provide a mathematical framework to describe the behaviour of physical phenomena. As a promising alternative to traditional first principles, data-driven differential equation discovery has attracted increasing attention for its ability to infer governing laws directly from experimental or simulated data, especially when the underlying physics is unclear. However, the field has expanded rapidly along diverse methodological directions, particularly with the emergence of AI-based approaches, and still lacks a clear organizing perspective. In this Review, we propose a problem-oriented perspective on data-driven differential equation discovery. We first introduce a two-dimensional phase diagram of equation discoverability, where discovery problems are organized according to structural complexity and coefficient complexity. This phase diagram shows how the field has moved from the discovery of sparse equations with simple coefficients toward more complex governing laws with richer structures and more flexible parameterizations. It also clarifies why different methodological families succeed or fail in different problem settings. We then present the representation-evaluation-optimization (REO) framework as a fundamental abstraction of the discovery process. By identifying the core problems of equation discovery that persist across algorithmic variations, REO shifts the discussion from individual algorithms to the fundamental principles that determine discoverability. We connect these perspectives to applications across physics and adjacent sciences, and argue that the next challenge is not merely recovering equations, but using them to revise existing theories, distil mechanisms and form new scientific concepts.
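
To illustrate the simplest corner of the phase diagram described above (sparse structure, constant coefficients), the following is a minimal sketch of one plausible reading of the three REO stages. It is not the authors' method: it uses generic sequentially thresholded least squares, and the logistic test equation, the monomial candidate library, the threshold value and all function names are assumptions chosen for illustration. Representation is a library of candidate terms, evaluation is the squared residual against the estimated derivative, and optimization is least squares with pruning of small coefficients.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(columns, target):
    """Evaluation + optimization: minimise the squared residual
    via the normal equations (adequate for a tiny library)."""
    k = len(columns)
    gram = [[sum(p * q for p, q in zip(columns[i], columns[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(columns[i], target)) for i in range(k)]
    return solve_linear(gram, rhs)


def discover(u, dudt, library, threshold=0.1):
    """Sequentially thresholded least squares over a candidate library.

    Terms whose fitted coefficient falls below `threshold` are pruned
    and the remaining terms are refitted until the active set is stable.
    """
    active = list(library)
    while active:
        columns = [[library[name](v) for v in u] for name in active]
        fit = least_squares(columns, dudt)
        kept = [n for n, c in zip(active, fit) if abs(c) >= threshold]
        if kept == active:
            return dict(zip(active, fit))
        active = kept
    return {}


# Synthetic data (assumed test case): logistic equation du/dt = u - u^2,
# sampled from its exact solution with u(0) = 0.1.
dt = 0.01
u_all = [1.0 / (1.0 + 9.0 * math.exp(-i * dt)) for i in range(1001)]
u = u_all[1:-1]
# Derivative estimated from data by central differences.
dudt = [(u_all[i + 1] - u_all[i - 1]) / (2 * dt)
        for i in range(1, len(u_all) - 1)]

# Representation: candidate terms the equation may be built from.
library = {
    "1": lambda v: 1.0,
    "u": lambda v: v,
    "u^2": lambda v: v * v,
    "u^3": lambda v: v ** 3,
}

model = discover(u, dudt, library)
print(model)
```

On this noise-free example the pruning step should retain only the `u` and `u^2` terms, with coefficients close to 1 and -1. Problems further out in the phase diagram, with richer structure or non-constant coefficients, would need a different representation and optimizer than this fixed library and threshold.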
