RaMem: Contextual Reinstatement for Long-term Agentic Memory

2026-06-22 • Artificial Intelligence

Artificial Intelligence, Multiagent Systems

AI summary

The authors address a failure mode in memory systems for AI agents: memories drawn from different situations can look equally relevant to a query even though they have lost the surrounding context needed to judge whether they are valid evidence. They call this 'context collapse' and propose RaMem, which ties each memory back to its original situation before it is used. RaMem works in four stages: it anchors memories in their episodic context, infers the conditions a query places on valid evidence, prioritizes context-compatible memories during retrieval, and keeps that context available to the generator. On long-term memory benchmarks, RaMem improves average F1 by more than 10% over strong memory baselines across several backbones.

long-term memory, LLM agents, context collapse, episodic memory, retrieval, evidence validity, memory compression, contextual reinstatement, memory benchmarks, F1 score

Authors

Wei Yang, Bryce Kan, Shixuan Li, Li Li, Yuehan Qin, Jiate Li, Paul Bogdan, Jesse Thomason

Abstract

Long-term memory has become increasingly important for LLM agents that operate across extended interactions and evolving task contexts. Recent memory systems have made past experiences more persistent, compact, and retrievable, but retrieval alone does not ensure that a memory provides valid evidence for the current query. When experiences are compressed into reusable fragments, memories from different situations may appear equally relevant if they involve recurring entities or user states. We refer to this failure as context collapse: memories lose the surrounding context needed to judge whether they provide valid evidence for the current query. To address this problem, we propose Contextual Reinstatement for Agentic Memory (RaMem), a framework that turns retrieved memory fragments into contextually verifiable evidence. RaMem operates through four coordinated stages: (i) evidence anchoring grounds each memory in its original episodic conditions, especially event time, mention time, session span, and participants; (ii) recall condition induction derives the evidence conditions implied by the query; (iii) validity-aware retrieval uses these conditions to prioritize context-compatible memories while retaining content-relevant candidates as fallback evidence; and (iv) context-preserved synthesis keeps the selected memories' structured context available to the generator. Experiments on long-term memory benchmarks show that RaMem consistently improves performance over strong memory baselines, with average F1 gains of more than 10% across several backbones.
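
The four stages described above can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`AnchoredMemory`, `RecallConditions`, `content_score`, `is_compatible`, `retrieve`, `render_context`) is hypothetical, the token-overlap scorer stands in for whatever retriever a real system would use, and the recall conditions are written by hand, whereas the abstract says they are induced from the query.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class AnchoredMemory:
    """Stage (i), assumed shape: a fragment plus its episodic anchors."""
    text: str
    event_time: date
    mention_time: date
    session_span: Tuple[int, int]
    participants: FrozenSet[str]


@dataclass(frozen=True)
class RecallConditions:
    """Stage (ii), assumed shape: conditions the query places on evidence."""
    after: Optional[date] = None
    before: Optional[date] = None
    participants: FrozenSet[str] = frozenset()


def content_score(query: str, memory: AnchoredMemory) -> int:
    """Toy relevance: number of shared lowercase tokens (an assumption)."""
    return len(set(query.lower().split()) & set(memory.text.lower().split()))


def is_compatible(memory: AnchoredMemory, cond: RecallConditions) -> bool:
    """True when the memory's anchors satisfy every stated condition."""
    if cond.after is not None and memory.event_time < cond.after:
        return False
    if cond.before is not None and memory.event_time > cond.before:
        return False
    return cond.participants <= memory.participants


def retrieve(query: str, cond: RecallConditions,
             memories: List[AnchoredMemory], k: int = 3) -> List[AnchoredMemory]:
    """Stage (iii): rank compatible memories first, keep the rest as fallback."""
    relevant = [m for m in memories if content_score(query, m) > 0]
    ranked = sorted(
        relevant,
        key=lambda m: (is_compatible(m, cond), content_score(query, m)),
        reverse=True,
    )
    return ranked[:k]


def render_context(memories: List[AnchoredMemory]) -> str:
    """Stage (iv): pass the structured anchors to the generator with the text."""
    return "\n".join(
        f"[event={m.event_time.isoformat()} "
        f"mentioned={m.mention_time.isoformat()} "
        f"sessions={m.session_span[0]}-{m.session_span[1]} "
        f"participants={','.join(sorted(m.participants))}] {m.text}"
        for m in memories
    )


# Two fragments that look equally relevant once stripped of context.
old = AnchoredMemory("Alice said she is moving to Berlin",
                     date(2024, 3, 1), date(2024, 3, 2), (1, 1), frozenset({"alice"}))
new = AnchoredMemory("Alice said she is moving to Lisbon",
                     date(2025, 2, 10), date(2025, 2, 10), (7, 7), frozenset({"alice"}))

cond = RecallConditions(after=date(2025, 1, 1), participants=frozenset({"alice"}))
top = retrieve("where is alice moving", cond, [old, new])
prompt_context = render_context(top)
```

In this example both fragments share the same tokens with the query, so content relevance alone cannot separate them. The hand-written time condition moves the 2025 memory to the front, while the 2024 memory stays in the list as fallback evidence rather than being filtered out.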
