FFR: Forward-Forward Learning for Regression
2026-06-02 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors present a new method called FFR that adapts the Forward-Forward algorithm to handle regression tasks, which involve predicting continuous values. They solve the problem of lacking natural opposites in regression by introducing a way to compare groups of neurons using ordinal information about target values. Their design includes a layered approach where early layers learn broad categories and deeper layers refine predictions, also estimating uncertainty. Experiments show that FFR performs nearly as well as traditional backpropagation but uses much less memory and runs faster. It also beats other methods that don't use backpropagation.
Forward-Forward algorithm, Backpropagation, Regression, Ordinal supervision, Neural networks, Contrastive learning, Layer-wise optimization, Uncertainty estimation, Multi-scale feature aggregation
Authors
Xinyang Liu, Xuanyu Liang, Shiqi Ding, Boyang Li, Zhiqiang Que, Jiayang Li, Guosheng Hu
Abstract
The Forward-Forward (FF) algorithm offers a computationally efficient and biologically plausible alternative to backpropagation (BP) by training neural networks through purely local, layer-wise optimization. However, FF is inherently designed for classification via contrastive positive-negative sample pairs, and extending it to regression poses fundamental challenges: continuous target spaces lack natural "opposites" for contrastive learning, and the standard goodness function carries no information about target magnitude or ordering. We propose FFR (Forward-Forward for Regression), which is, to our knowledge, the first framework to extend FF to real-world regression and to demonstrate competitive performance across diverse datasets. FFR introduces three key innovations: (1) an ordinal competitive goodness function that replaces contrastive pairs with competitive learning between partitioned neuron groups under distance-aware ordinal supervision; (2) a stratified ladder architecture in which shallow layers learn coarse ordinal discrimination and deeper layers refine it into fine-grained regression, with multi-scale feature aggregation for inter-layer collaboration; and (3) hierarchical prediction with uncertainty estimation, in which multi-scale predictors jointly provide robust predictions and, as a free lunch, prediction confidence. Extensive experiments show that FFR recovers on average 98.6% of BP's accuracy across five real-world regression benchmarks while reducing peak training memory to only 27% of BP's at depth 8 and 8% at depth 32, with per-iteration time around 72% of BP's, and that it substantially outperforms all BP-free competitors.
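To make innovation (1) more concrete, the following is a minimal NumPy sketch of one way a layer-local loss with competing neuron groups and distance-aware ordinal targets could look. It is an illustration only and not the authors' formulation: the abstract does not give the loss, so the equal-width target bins, the mean-squared-activation goodness per group, the softmax competition, the exponential distance weighting, and every name here (`group_goodness`, `ordinal_soft_targets`, `ordinal_competitive_loss`, `temperature`, `bin_edges`) are assumptions.

```python
import numpy as np

def group_goodness(h, num_groups):
    """Goodness per neuron group: mean squared activation (assumed form).

    h has shape (batch, units); units must be divisible by num_groups.
    """
    batch, units = h.shape
    assert units % num_groups == 0, "units must split evenly into groups"
    groups = h.reshape(batch, num_groups, units // num_groups)
    return (groups ** 2).mean(axis=2)  # shape (batch, num_groups)

def ordinal_soft_targets(y, bin_edges, temperature=1.0):
    """Distance-aware soft labels (assumed form).

    Each group owns one target bin; groups whose bin center is closer
    to the true target y receive more probability mass.
    """
    y = np.asarray(y, dtype=float)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    dist = np.abs(y[:, None] - centers[None, :])
    logits = -dist / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def ordinal_competitive_loss(h, y, bin_edges, temperature=1.0):
    """Cross-entropy between softmax(goodness) and the ordinal soft labels.

    The loss depends only on this layer's activations, so it can be
    minimized locally without backpropagating through other layers.
    """
    g = group_goodness(h, len(bin_edges) - 1)
    g = g - g.max(axis=1, keepdims=True)  # numerical stability
    log_q = g - np.log(np.exp(g).sum(axis=1, keepdims=True))
    p = ordinal_soft_targets(y, bin_edges, temperature)
    return float(-(p * log_q).sum(axis=1).mean())

# Hypothetical usage: 12 hidden units split into 3 groups over targets in [0, 1].
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 12))
y = np.array([0.1, 0.4, 0.6, 0.9])
edges = np.linspace(0.0, 1.0, 4)  # 3 equal-width bins
loss = ordinal_competitive_loss(h, y, edges, temperature=0.1)
```

Under these assumptions, a coarse-to-fine ladder in the spirit of innovation (2) could be approximated by giving shallow layers few bins and deeper layers more, although the abstract does not specify how the paper sets this up.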