eMEM: A Hybrid Spatio-Temporal Memory System For Embodied Agents

2026-06-02

Robotics
AI summary

The authors introduce eMEM, a new memory system designed for robots or agents that interact with the real world, helping them remember things by meaning, location, and time all at once. Unlike previous systems that store memory as just text or graphs, eMEM combines several storage methods into one graph-based framework and processes observations into summarized memories like the human brain does. They also created eMEM-Bench, a testing setup inspired by human psychology to check how well the memory works. Their tests show eMEM keeps memories more accurate and stable over time compared to simpler methods. Both the memory system and testing tools are shared publicly by the authors.

Embodied agents, Graph-based memory, Semantic search, Spatial queries, Hippocampal-neocortical consolidation, Multi-index architecture, Long-term memory retention, Cognitive psychology paradigms, Memory benchmarks, Retrieval-augmented generation (RAG)
Authors
A. Haroon Rasheed, Maria Kabtoul
Abstract
We present eMEM (Embodied Memory), a hybrid graph-based memory system for embodied agents operating in physical environments. Current agent memory architectures, such as Generative Agents, MemGPT, and A-MEM, treat memory as text streams or knowledge graphs, but embodied agents require memory that is simultaneously searchable by meaning, space, and time. eMEM fills this gap with a multi-index architecture (SQLite for structured storage, hnswlib for approximate nearest-neighbour semantic search, and an R-tree for spatial queries) unified behind a single graph model. A tiered consolidation pipeline transforms raw perceptual observations into compressed summaries, mirroring hippocampal-neocortical consolidation in biological systems. Ten agent-facing recall tools expose memory retrieval primitives, including concept-to-location resolution and cross-layer recall, as first-class operations for LLM tool calling. The system is fully embedded and runs in-process alongside the agent. In addition, we introduce eMEM-Bench v1, a benchmark we construct over ProcTHOR-10K scenes for embodied memory evaluation. The benchmark is organised explicitly around eight cognitive-psychology paradigms (DRM lures, pattern separation, pattern completion, source monitoring, context-dependent retrieval, long-horizon interference, serial position, and a foil-augmented retention curve), each chosen so that the result is interpretable against the broader memory-systems literature in humans and prior agent-memory systems. This gives a level of diagnostic detail that surface-task benchmarks like LoCoMo or OpenEQA cannot provide. eMEM scores a weighted mean of 80.8 over 988 probes, with a flat retention curve at ceiling from 1 h to 1 yr of simulated delay on room-unique items. We show that a pure RAG baseline (the flat_rag ablation) loses 30 pt on context-dependent retrieval and 29 pt on DRM lure rejection, isolating the contribution of multi-layer storage and consolidation respectively.
We release both the system and the benchmark code.
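
To make the idea of memory that is searchable by meaning, space, and time more concrete, the following is a minimal sketch and not eMEM's actual code or API. The class name TinyHybridMemory and its methods (add, semantic, spatial, temporal, recall) are hypothetical. To stay dependency-free, it keeps the SQLite store from the abstract but swaps in an exact cosine scan where eMEM uses hnswlib and a plain bounding-box SQL filter where eMEM uses an R-tree. The toy embeddings are hand-written assumptions.

```python
import math
import sqlite3


class TinyHybridMemory:
    """Hypothetical toy store: one SQLite table plus brute-force semantic search."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE obs (id INTEGER PRIMARY KEY, text TEXT, x REAL, y REAL, t REAL)"
        )
        self.vectors = {}  # observation id -> embedding (list of floats)

    def add(self, text, embedding, x, y, t):
        cur = self.db.execute(
            "INSERT INTO obs (text, x, y, t) VALUES (?, ?, ?, ?)", (text, x, y, t)
        )
        self.vectors[cur.lastrowid] = embedding
        return cur.lastrowid

    def semantic(self, query, k=3):
        # Exact cosine ranking; an ANN index such as hnswlib would replace this scan.
        def cos(a, b):
            dot = sum(p * q for p, q in zip(a, b))
            norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self.vectors, key=lambda i: cos(query, self.vectors[i]), reverse=True
        )
        return ranked[:k]

    def spatial(self, xmin, ymin, xmax, ymax):
        # Linear bounding-box filter; an R-tree would answer this without a full scan.
        rows = self.db.execute(
            "SELECT id FROM obs WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
            (xmin, xmax, ymin, ymax),
        )
        return [r[0] for r in rows]

    def temporal(self, t_start, t_end):
        rows = self.db.execute(
            "SELECT id FROM obs WHERE t BETWEEN ? AND ?", (t_start, t_end)
        )
        return [r[0] for r in rows]

    def recall(self, query, box, window, k=3):
        # Intersect the spatial and temporal candidates, keeping semantic rank order.
        in_box = set(self.spatial(*box))
        in_window = set(self.temporal(*window))
        ranked = self.semantic(query, k=len(self.vectors))
        hits = [i for i in ranked if i in in_box and i in in_window][:k]
        return [
            self.db.execute("SELECT text FROM obs WHERE id = ?", (i,)).fetchone()[0]
            for i in hits
        ]


mem = TinyHybridMemory()
mem.add("red mug on kitchen counter", [1.0, 0.0, 0.0], 2.0, 3.0, 10.0)
mem.add("red mug on office desk", [0.9, 0.1, 0.0], 8.0, 1.0, 50.0)
mem.add("laptop on office desk", [0.0, 1.0, 0.0], 8.2, 1.1, 55.0)

# "Mug-like things seen near (8, 1) between t=40 and t=60"
result = mem.recall([1.0, 0.0, 0.0], box=(7.0, 0.0, 9.0, 2.0), window=(40.0, 60.0))
print(result)
```

In this sketch the kitchen mug is the closest semantic match but is excluded by the spatial and temporal filters, which illustrates why a single text or vector index is not enough for the kind of query the abstract describes.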