Data Attribution in Adaptive Learning
2026-04-06 • Machine Learning
AI summary
The authors study machine learning setups where models create their own training data as they learn, like in reinforcement learning. They point out that usual methods for figuring out how each piece of training data affects the model don't work well because each new example changes what data comes next. They develop a new way to measure this effect in a setting where learning happens in stages, and show that you generally can't recover this information just by replaying past data unless certain conditions are met.
machine learning, reinforcement learning, bandits, adaptive learning, data distribution shift, attribution methods, conditional intervention, finite-horizon, logged data
Authors
Amit Kiran Rege
Abstract
Machine learning models increasingly generate their own training data -- online bandits, reinforcement learning, and post-training pipelines for language models are leading examples. In these adaptive settings, a single training observation both updates the learner and shifts the distribution of future data the learner will collect. Standard attribution methods, designed for static datasets, ignore this feedback. We formalize occurrence-level attribution for finite-horizon adaptive learning via a conditional interventional target, prove that replay-side information cannot recover it in general, and identify a structural class in which the target is identified from logged data.
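The feedback loop the abstract describes can be made concrete with a toy bandit. The sketch below is illustrative and not from the paper: a greedy two-armed bandit where each pull both updates the learner's estimates and determines which arm is pulled next. Withholding a single observation from the update (a crude stand-in for the paper's conditional intervention on one occurrence) changes the entire subsequent trajectory, while deleting that observation from the log keeps the rest of the logged trajectory fixed. The arm means, noise model, and `drop_step` mechanism are all hypothetical choices for the demonstration.

```python
import random

MU = [0.2, 0.8]  # hypothetical true mean reward of each arm

def run_bandit(T, seed, drop_step=None):
    """Greedy two-armed bandit with optimistic initial estimates.
    Each pull both updates the estimates and steers which arm is
    pulled next, so the learner generates its own training data.
    If drop_step is set, that observation is withheld from the
    update (an intervention on a single occurrence)."""
    rng = random.Random(seed)
    counts, sums = [0, 0], [0.0, 0.0]
    log = []
    for t in range(T):
        # optimistic estimate 1.0 for unpulled arms; greedy choice,
        # ties broken toward arm 0
        est = [sums[a] / counts[a] if counts[a] else 1.0 for a in range(2)]
        arm = 0 if est[0] >= est[1] else 1
        reward = MU[arm] + rng.uniform(-0.01, 0.01)  # bounded noise
        log.append((arm, reward))
        if t != drop_step:  # intervene: skip this one update
            counts[arm] += 1
            sums[arm] += reward
    return log

log_full = run_bandit(6, seed=0)
log_drop = run_bandit(6, seed=0, drop_step=0)

arms_full = [a for a, _ in log_full]  # [0, 1, 1, 1, 1, 1]
arms_drop = [a for a, _ in log_drop]  # [0, 0, 1, 1, 1, 1]

# Replay-side attribution would delete observation 0 from the *log*
# and keep the logged continuation fixed (arm 1 at step 1). The
# interventional rerun instead pulls arm 0 again at step 1: the log
# alone cannot reveal the data the learner would have collected.
print(arms_full, arms_drop)
```

In the full run the first pull of arm 0 lowers its estimate, so the learner switches to arm 1 immediately; when that first observation is withheld, the learner repeats arm 0 before switching. The divergence between the two arm sequences is exactly the feedback that static, replay-based attribution ignores.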