Data-Driven Energy-Based Learning via Gibbs Measures on Hierarchical Structures

2026-06-29
Machine Learning
AI summary

The authors propose a probabilistic view of learning from data based on Gibbs measures, a concept from statistical physics, in which a dataset determines a family of possible learned models rather than a single best fit. The empirical loss is converted into an energy that defines a probability distribution over learning outcomes, capturing a range of equilibrium states. The work connects traditional loss landscapes to probabilistic models on tree-like structures and identifies conditions under which several learning states coexist, so that learning can exhibit phase transitions. The authors also give mathematical conditions for when the solution is unique and when multiple solutions occur, and they illustrate the results with numerical examples.

Gibbs measure, Empirical risk minimization, Energy-based models, Hierarchical structures, Fixed-point equations, Cayley tree, Phase transition, Probabilistic inference, Integral equations, Translation-invariant solutions
Authors
L. U. Abdullaev, F. Herrera, U. A. Rozikov, M. V. Velasco
Abstract
We introduce a data-driven probabilistic framework for learning systems based on Gibbs measures on hierarchical structures. Unlike standard empirical risk minimization, where a dataset is used to identify a single optimal parameter, our approach transforms the empirical loss function into an interaction potential defining an energy-based model. The resulting Gibbs distribution describes a family of equilibrium learning states generated by the data. We formulate the consistency conditions of the associated finite-volume distributions and derive nonlinear integral fixed-point equations whose solutions characterize the admissible learning states. These equations provide a rigorous connection between empirical loss landscapes and probabilistic inference on trees. For translation-invariant solutions, the problem reduces to the analysis of positive compact operators induced by data-dependent kernels, allowing us to establish existence and uniqueness conditions in the one-dimensional setting. Furthermore, we show that hierarchical learning systems may exhibit phase-transition phenomena: for certain empirical kernels on Cayley trees, multiple Gibbs measures emerge beyond a critical inverse temperature, corresponding to distinct equilibrium prediction regimes. Numerical experiments with non-separable kernels illustrate the appearance of multiple solution branches and demonstrate the coexistence of several data-induced learning states. Our results provide a new perspective on energy-based learning, where data do not merely determine an optimal model through minimization but define an entire probabilistic landscape of possible inference states.
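
The phase-transition statement can be illustrated with a much simpler classical analogue. The sketch below is not the paper's data-dependent kernel or its integral equations, which this abstract does not state. It assumes the nearest-neighbour ferromagnetic Ising model on a Cayley tree of order k, where translation-invariant Gibbs measures correspond to solutions of the scalar fixed-point equation h = k artanh(tanh(beta J) tanh(h)). The function names, the coupling J = 1, and the choice k = 2 are illustrative assumptions. Below the critical inverse temperature beta_c = artanh(1/k) / J only h = 0 solves the equation; above it, two further solutions +h* and -h* appear.

```python
import math


def boundary_field_map(h, beta, J=1.0, k=2):
    """One step of the translation-invariant recursion for the Ising model
    on a Cayley tree of order k: h -> k * artanh(tanh(beta*J) * tanh(h)).
    (Classical stand-in model, not the paper's empirical kernel.)"""
    theta = math.tanh(beta * J)
    return k * math.atanh(theta * math.tanh(h))


def iterate_fixed_point(h0, beta, J=1.0, k=2, n_iter=500):
    """Plain fixed-point iteration started from h0."""
    h = h0
    for _ in range(n_iter):
        h = boundary_field_map(h, beta, J, k)
    return h


def critical_beta(J=1.0, k=2):
    """Nonzero solutions exist iff k * tanh(beta*J) > 1."""
    return math.atanh(1.0 / k) / J


if __name__ == "__main__":
    print("critical beta:", critical_beta())
    for beta in (0.3, 1.0):
        # Start from a positive and a negative field to probe both branches.
        h_plus = iterate_fixed_point(1.0, beta)
        h_minus = iterate_fixed_point(-1.0, beta)
        print(f"beta={beta}: limit from +1 -> {h_plus:.6f}, from -1 -> {h_minus:.6f}")
```

For k = 2 and J = 1 the threshold is about 0.549. At beta = 0.3 both starting points collapse to h = 0 (a single Gibbs measure), while at beta = 1.0 they settle on two distinct symmetric limits, the discrete counterpart of the coexisting equilibrium regimes described above.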