LLM REgression with a Latent Iterative State Head

2026-04-01

Computation and Language · Machine Learning
AI summary

The authors introduce RELISH, a method for predicting numeric values from text with large language models while keeping the underlying model itself unchanged. Rather than decoding numbers directly as text, RELISH refines an internal latent state step by step by attending over token-level details, then predicts a single scalar from this refined state. Tested on several datasets and models, RELISH outperforms older methods while adding very few extra parameters, keeping the main language model frozen and attaching only a small, trainable regression component.

text regression · large language models · latent state · cross-attention · scalar prediction · parameter efficiency · autoregressive decoding · LoRA · frozen models · linear regressor
Authors
Yiheng Su, Matthew Lease
Abstract
We present RELISH (REgression with a Latent Iterative State Head), a novel, lightweight architecture designed for text regression with large language models. Rather than decoding numeric targets as text or aggregating multiple generated outputs, RELISH predicts scalar values directly from frozen LLM representations by iteratively refining a learned latent state through cross-attention over token-level representations, and then mapping the final state to a point estimate with a linear regressor. Across five datasets, four LLM backbones, and two LLM training regimes, RELISH consistently outperforms prior baselines from all three major LLM regression families, including autoregressive decoding, regression-aware inference, and existing predictive head methods. Despite these gains, RELISH remains highly parameter-efficient, requiring only 3.4-3.7M trainable parameters across frozen LLM backbones (only 0.01-0.04% additional overhead), far less than LoRA-based alternatives that grow with model size (0.26-0.42%).
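The head described in the abstract can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the latent dimension, head dimension, number of refinement steps, and the residual update rule are all hypothetical choices; the real RELISH head is trained end-to-end on frozen LLM representations.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class LatentIterativeStateHead:
    """Hypothetical sketch of a RELISH-style head: a single learned latent
    state is refined over `n_steps` rounds of cross-attention against
    frozen token-level LLM representations, then mapped to a scalar
    point estimate by a linear regressor."""

    def __init__(self, d_model, d_head=64, n_steps=4, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(d_model)
        self.z0 = rng.normal(0.0, s, d_model)            # learned initial latent state
        self.Wq = rng.normal(0.0, s, (d_model, d_head))  # query projection (latent)
        self.Wk = rng.normal(0.0, s, (d_model, d_head))  # key projection (tokens)
        self.Wv = rng.normal(0.0, s, (d_model, d_model)) # value projection (tokens)
        self.w = rng.normal(0.0, s, d_model)             # linear regressor weights
        self.b = 0.0                                     # linear regressor bias
        self.n_steps = n_steps
        self.d_head = d_head

    def __call__(self, H):
        """H: (seq_len, d_model) frozen token representations -> scalar."""
        z = self.z0
        for _ in range(self.n_steps):
            q = z @ self.Wq                     # (d_head,) latent query
            K = H @ self.Wk                     # (seq_len, d_head)
            V = H @ self.Wv                     # (seq_len, d_model)
            attn = softmax(K @ q / np.sqrt(self.d_head))  # attention over tokens
            z = z + attn @ V                    # residual refinement of the latent
        return float(self.w @ z + self.b)       # scalar point estimate


# Toy usage: 10 tokens with a (hypothetical) hidden size of 32.
H = np.random.default_rng(1).normal(size=(10, 32))
head = LatentIterativeStateHead(d_model=32)
y = head(H)
```

Note how the parameter count matches the paper's efficiency claim in spirit: only the small projections, the latent state, and the linear regressor are trainable, independent of the backbone's depth.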