ERBench: A Benchmark and Testsuite for Equation Discovery Algorithms

2026-06-08 · Machine Learning

AI summary

The authors explain that discovering scientific formulas from data is usually done with symbolic regression. Measuring how well these algorithms work is tricky, because accuracy on test data drawn from the same domain as the training data does not show whether an algorithm has found a formula that holds in new situations. The authors therefore judge an algorithm by whether it can recover known, exact formulas. They also point out that existing benchmarks contain only a small number of such formulas and pay little attention to robustness under changes in dimensionality, sample size, sampling distribution, and sampling domain, even though real data is noisy and varied. To address this, they introduce a new benchmark, ERBench, for evaluating these algorithms under realistic conditions.

Equation discovery, Symbolic regression, Prediction accuracy, Equation recovery, Out-of-domain testing, Benchmark, Dimensionality, Sampling distribution, Noisy data, Model generalization
Authors
Paul Kahlmeyer, Henrik Voigt, Michael Habeck, Joachim Giesen
Abstract
Equation discovery aims to automate the discovery of scientific models in the form of mathematical equations from data. Technically, equation discovery is implemented by symbolic regression algorithms. The performance of symbolic regression for equation discovery is measured along two dimensions: prediction accuracy on test data and recovery of known ground-truth formulas. For standard regression, accuracy is typically measured on in-domain test data, for instance by splitting a data set randomly into training and test data. While this makes sense for in-domain interpolation, which is the common goal in ordinary regression, it can be a misleading proxy for true model discovery and generalization. The obvious alternative is to measure out-of-domain accuracy. However, obtaining challenging out-of-domain test data is a non-trivial problem. Therefore, we focus on equation recovery for evaluating symbolic regression algorithms for equation discovery. The rationale is that symbolic regression algorithms that perform well in recovering known ground-truth formulas are good candidates to perform well in unknown equation discovery. Existing benchmarks for symbolic regression include equation recovery tasks, but only with a small number of ground-truth formulas that are publicly known. Moreover, these benchmarks place less emphasis on evaluating the robustness of algorithms under changing dimensionality, sample size, sampling distribution, and sampling domain. This robustness, however, is of central importance to practitioners who want to discover equations for modeling natural phenomena, since real data is almost certainly noisy and comes from diverse domains, distributions, and sample sizes. To fill this gap, we introduce the Equation Recovery Benchmark (ERBench), a new evaluation framework designed to rigorously assess algorithms that explicitly target the task of equation discovery.
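
The abstract's point that in-domain accuracy can be a misleading proxy for model discovery can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from ERBench: the ground-truth formula sin(x), the training domain [-1, 1], the out-of-domain interval [3, 5], and a degree-5 polynomial standing in for a fitted model that predicts well without recovering the formula.

```python
import numpy as np


def ground_truth(x):
    # Hypothetical ground-truth formula, chosen only for illustration.
    return np.sin(x)


rng = np.random.default_rng(0)

# Sample inputs from the training domain [-1, 1] and split them randomly,
# which yields an in-domain test set.
x = rng.uniform(-1.0, 1.0, size=200)
y = ground_truth(x)
perm = rng.permutation(x.size)
train_idx, test_idx = perm[:150], perm[150:]

# Stand-in "model": a degree-5 polynomial fitted by least squares.
# It approximates sin(x) on [-1, 1] but is not the true formula.
coeffs = np.polyfit(x[train_idx], y[train_idx], deg=5)


def mse(x_eval):
    # Mean squared error of the fitted polynomial against the ground truth.
    residual = np.polyval(coeffs, x_eval) - ground_truth(x_eval)
    return float(np.mean(residual ** 2))


in_domain_mse = mse(x[test_idx])

# Out-of-domain test set: inputs from [3, 5], disjoint from the training domain.
x_ood = rng.uniform(3.0, 5.0, size=50)
out_of_domain_mse = mse(x_ood)

print(f"in-domain MSE:     {in_domain_mse:.2e}")
print(f"out-of-domain MSE: {out_of_domain_mse:.2e}")
```

In this sketch the polynomial scores a very small error on the random split but a much larger one outside the training domain, whereas a model that had recovered sin(x) itself would be accurate on both.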