Simulating Human Memory with Language Models

2026-05-25, Computation and Language

Computation and Language, Artificial Intelligence
AI summary

The authors compare the memory of language models with that of humans by running classic memory experiments from psychology on both. They find that out-of-the-box language models remember better than humans, even when prompted to imitate human behavior. With better prompting strategies and a component the authors call a compactor, the models forget content in a more human-like way. The authors also report preliminary evidence that models with human-like memory constraints make more effective user simulators in a downstream education task, and they release human reference data and benchmarks to support further work.

language models, memory experiments, human memory, user simulation, prompting strategies, compactor, education tasks, memory constraints, benchmark datasets, cognitive modeling
Authors
Qihan Wang, Nicholas Tomlin, Michael Hu, Brian Dillon, Tal Linzen
Abstract
Language models are increasingly being deployed as user simulators, but their memory is far more reliable than that of real users. To measure this gap, we run a series of classic memory experiments from psychology on both humans and language models. Across tasks, we find that out-of-the-box language models exhibit better memory than humans, even when prompted to imitate human behavior. We then show that better prompting strategies and the use of a compactor can cause language models to forget content in a more human-like way. Using these methods, we show preliminary evidence that language models with human-like memory constraints can function as more effective user simulators in a downstream education task. Finally, we release human reference data and benchmarks to support future work on simulating human memory with language models.