Training a Predictive Coding Network on ImageNet using Equilibrium Propagation
2026-06-02 • Machine Learning
Machine Learning, Neural and Evolutionary Computing
AI summary
The authors study Equilibrium Propagation (EP), a physics-inspired training method, and apply it to predictive coding networks (PCNs), a class of brain-inspired energy-based models. They design an EP-based training procedure for PCNs and use it to train a convolutional network on ImageNet, reaching a top-5 error close to that of a backpropagation baseline. According to the authors, this is the first time either PCNs or EP-based training has been demonstrated at ImageNet scale. The result suggests that the difficulty of scaling EP in other physical systems may stem from the computational properties of those systems rather than from the EP framework itself.
Equilibrium Propagation, Predictive Coding Networks, Energy-Based Models, ImageNet, Convolutional Neural Networks, VGG10, Backpropagation, Top-5 Classification Error, Computational Neuroscience, Training Algorithms
Authors
Tugdual Kerjan, Rasmus Høier, Benjamin Scellier
Abstract
Equilibrium Propagation (EP) is a physics-based training framework that has primarily been employed in energy-based models, including continuous Hopfield networks, nonlinear resistive networks and coupled phase oscillators. However, EP's practical applications have so far remained limited to relatively small-scale problems. Predictive coding networks (PCNs), another class of energy-based models rooted in computational neuroscience, are typically trained with a specialized algorithm and have likewise not yet been demonstrated at large scale. In this work, we develop an EP-based training method for PCNs that combines the centered variant of EP with a novel equilibration scheme for PCNs. Using this approach, we train a 10-layer convolutional PCN (VGG10) on full-size ImageNet, achieving a 13.23% test error rate on the top-5 classification task, close to the 12.2% backpropagation baseline. To our knowledge, this is the first demonstration of both PCNs and EP-based training at ImageNet scale. These results significantly extend the scalability of both approaches and suggest that the primary challenges in scaling EP in other physical systems may come more from the computational properties of these systems than from inherent limitations of the EP framework.
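To make the phrase "centered variant of EP" concrete, the following is a minimal sketch on a toy problem, not the method of this work. It assumes a two-layer scalar linear PCN with energy E = ½(h − w1·x)² + ½(o − w2·h)² and cost C = ½(o − y)², and it relaxes the states by plain gradient descent, which stands in for (and is not) the novel equilibration scheme mentioned above. All function names, the step size, and the nudging strength `beta` are illustrative assumptions. Centered EP estimates the loss gradient as the difference of ∂E/∂θ at the equilibria reached with nudging +β and −β, divided by 2β.

```python
def relax(w1, w2, x, y, beta, steps=500, lr=0.5):
    """Find the equilibrium of F = E + beta * C by gradient descent on (h, o).

    Assumption: lr = 0.5 is stable for small |w2|; larger weights would
    need a smaller step size.
    """
    h = w1 * x          # start from the free (beta = 0) equilibrium
    o = w2 * h
    for _ in range(steps):
        e1 = h - w1 * x             # prediction error at the hidden layer
        e2 = o - w2 * h             # prediction error at the output layer
        dF_dh = e1 - w2 * e2
        dF_do = e2 + beta * (o - y)
        h -= lr * dF_dh
        o -= lr * dF_do
    return h, o


def energy_grads(w1, w2, x, h, o):
    """Partial derivatives of E with respect to (w1, w2) at fixed states."""
    return -(h - w1 * x) * x, -(o - w2 * h) * h


def centered_ep_gradients(w1, w2, x, y, beta=0.01):
    """Centered EP estimate: symmetric difference of dE/dtheta at +beta and -beta."""
    h_pos, o_pos = relax(w1, w2, x, y, +beta)
    h_neg, o_neg = relax(w1, w2, x, y, -beta)
    g_pos = energy_grads(w1, w2, x, h_pos, o_pos)
    g_neg = energy_grads(w1, w2, x, h_neg, o_neg)
    return tuple((p - n) / (2.0 * beta) for p, n in zip(g_pos, g_neg))


def backprop_gradients(w1, w2, x, y):
    """Reference gradients of L = 0.5 * (w2 * w1 * x - y)**2 via the chain rule."""
    h = w1 * x
    o = w2 * h
    return (o - y) * w2 * x, (o - y) * h


if __name__ == "__main__":
    print("centered EP:", centered_ep_gradients(0.8, 0.5, 1.0, 2.0))
    print("backprop:   ", backprop_gradients(0.8, 0.5, 1.0, 2.0))
```

In this toy setting the two estimates agree up to an error that shrinks quadratically with β, which is the usual motivation for a symmetric (centered) nudge over a one-sided one. Nothing here reflects the convolutional architecture or the training details used for the ImageNet result.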