Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression

2026-06-01 · Neural and Evolutionary Computing

AI summary

The authors focus on improving a method called GP-GOMEA that builds mathematical expressions to fit data well. They noticed that while GP-GOMEA creates good expression structures, it doesn’t fine-tune the numbers inside those expressions very well. To fix this, they combined techniques to optimize both the expression shape and the numbers at the same time. Their tests showed this combined approach works better than older methods that handled numbers separately or afterwards.

Genetic Programming, Symbolic Regression, GP-GOMEA, Constant Optimization, Evolutionary Algorithms, Mixed Discrete-Continuous Optimization, Model-Based Evolutionary Algorithm, Linear Scaling, Ephemeral Random Constants
Authors
Johannes Koch, Tanja Alderliesten, Peter A. N. Bosman
Abstract
Genetic programming (GP) approaches are among the state of the art for symbolic regression, the task of constructing symbolic expressions that fit data well. To find highly accurate symbolic expressions, both the expression structure and any contained real-valued constants are important. GP-GOMEA, a modern model-based evolutionary algorithm, is one of the leading algorithms for finding accurate yet compact expressions. However, GP-GOMEA does not perform dedicated constant optimization and instead relies on ephemeral random constants. Hence, its accuracy may well be improved by incorporating a constant optimization mechanism. Existing research into mixed discrete-continuous optimization with evolutionary algorithms (EAs) has shown that a simultaneous and well-integrated approach to optimizing both the discrete and the continuous parts leads to the best results on a variety of problems, especially when there are interactions between these parts. In this paper, we therefore propose a novel approach in which the constants in expressions are optimized at the same time as the expression structure, by merging the real-valued variant of GOMEA with GP-GOMEA. The proposed approach is compared to other forms of handling constants in GP-GOMEA, and in the context of other commonly used techniques such as linear scaling, restarts, and constant tuning after GP optimization. Our results indicate that our novel approach generally performs best and confirm the importance of simultaneous constant optimization during evolution.
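To make one of the baseline techniques named in the abstract concrete, the following is a minimal sketch of linear scaling in its standard least-squares form: given the outputs f of a candidate expression, it finds the intercept a and slope b that minimize the mean squared error of a + b·f. This is an illustration only, not the authors' implementation; the function names `linear_scaling` and `mse` and the toy target are assumptions made for this example.

```python
def linear_scaling(y, f):
    """Return (a, b) minimizing mean((y - (a + b * f)) ** 2).

    Closed-form least squares: b = cov(y, f) / var(f), a = mean(y) - b * mean(f).
    Hypothetical helper for illustration, not taken from the paper.
    """
    n = len(y)
    mean_y = sum(y) / n
    mean_f = sum(f) / n
    var_f = sum((fi - mean_f) ** 2 for fi in f)
    if var_f == 0.0:
        # The expression output is constant, so only an intercept can be fitted.
        return mean_y, 0.0
    cov = sum((fi - mean_f) * (yi - mean_y) for fi, yi in zip(f, y))
    b = cov / var_f
    a = mean_y - b * mean_f
    return a, b


def mse(y, pred):
    """Mean squared error between targets and predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)


# Toy data (assumed for this example): the target is 3 * x^2 + 2.
xs = [i / 10.0 for i in range(-10, 11)]
ys = [3.0 * x * x + 2.0 for x in xs]

# Candidate expression with the right structure but no constants: x^2.
fs = [x * x for x in xs]

a, b = linear_scaling(ys, fs)
scaled = [a + b * fi for fi in fs]

print("error without scaling:", mse(ys, fs))
print("error with scaling:   ", mse(ys, scaled))
```

Linear scaling only recovers an outer slope and intercept. A constant nested inside the expression, such as the 2.5 in sin(2.5·x), is out of its reach, which is the gap that dedicated constant optimization is meant to address.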