Memory Contagion: Cross-Temporal Propagation of Evaluator Bias via Agent Memory
2026-06-22 • Machine Learning
Machine Learning, Artificial Intelligence, Computation and Language
AI summary
The authors studied how biases in the people or systems that guide AI agents can spread over time through the agents' memories. They identified a new problem called Memory Contagion, where biased experiences get stored and passed along, affecting future agents even if the memory system itself works perfectly. Their experiments showed different biases behave differently during memory updates, and even a small amount of bias in the input can lead to noticeable bias later. This work highlights an important weakness in how agent memories are currently designed.
Large Language Model, Agent Memory, Bias Propagation, Memory Consolidation, Evaluator Bias, Length Preference Bias, Authority Bias, Cross-temporal Effects, Bias Contamination, Oracle Condition
Authors
Zewen Liu
Abstract
Large Language Model (LLM) agents increasingly rely on memory systems to maintain long-term coherence. Recent work shows that agent memories degrade during continuous consolidation. However, existing research assumes memories are derived from unbiased experiences. In this work, we identify and formalize a novel phenomenon: Memory Contagion -- the cross-temporal propagation of evaluator bias through agent memory. We show that when agents are trained or guided by biased evaluators, their experiences become biased; when these trajectories are stored and consolidated into memory, the bias propagates to future agents retrieving from the same memory store, even when consolidation is perfect (oracle). Across two bias types (length preference, authority bias) and four experimental phases, we demonstrate: (1) Memory Contagion occurs even with perfect consolidation (oracle condition), proving that biased input is a sufficient cause of contagion; (2) Consolidation has opposite effects depending on bias type -- robustly attenuating length bias while preliminarily amplifying authority bias (single-run estimate), suggesting a bias-type-dependent interaction; (3) No observed safe threshold: bias propagation is detected at contamination rates as low as p=0.2. Our findings expose a critical vulnerability in current agent memory designs and provide formal tools for measuring cross-temporal bias propagation.
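The oracle-condition claim can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's experimental setup: the function names (`build_memory`, `oracle_consolidate`, `propagated_bias`), the uniform response-length range, the pairwise evaluator, and the mean-length bias metric are all hypothetical choices made for illustration. A fraction p of stored experiences is selected by a length-preferring evaluator, consolidation copies memory losslessly, and the bias is read back from the consolidated store.

```python
import random
from statistics import mean


def build_memory(p, n=5000, seed=0):
    """Build a toy memory store of response lengths (hypothetical setup).

    Each entry is chosen from two candidate responses. With probability p the
    choice is made by a length-preferring evaluator (keeps the longer one);
    otherwise an unbiased evaluator picks one at random.
    """
    rng = random.Random(seed)
    memory = []
    for _ in range(n):
        a = rng.randint(50, 500)            # candidate response lengths
        b = rng.randint(50, 500)
        contaminated = rng.random() < p     # did a biased evaluator judge this pair?
        unbiased_pick = rng.choice((a, b))  # always drawn, so streams match across p
        memory.append(max(a, b) if contaminated else unbiased_pick)
    return memory


def oracle_consolidate(memory):
    """Perfect consolidation: a lossless copy, introducing no error of its own."""
    return list(memory)


def propagated_bias(p, n=5000, seed=0):
    """Mean stored length at contamination rate p minus the clean (p = 0) baseline."""
    clean = mean(oracle_consolidate(build_memory(0.0, n, seed)))
    mixed = mean(oracle_consolidate(build_memory(p, n, seed)))
    return mixed - clean


if __name__ == "__main__":
    for p in (0.0, 0.2, 0.5, 1.0):
        print(f"p={p:.1f}  length bias in memory: {propagated_bias(p):.1f}")
```

Because consolidation here changes nothing, any bias a future agent would read from the store comes entirely from the biased input, and it is already nonzero at p=0.2. The sketch does not model attenuation or amplification during imperfect consolidation, nor authority bias.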