Task-Adaptive Retrieval over Agentic Multi-Modal Web Histories via Learned Graph Memory

2026-04-09 · Information Retrieval

Information Retrieval, Artificial Intelligence
AI summary

The authors present ACGM, a system that retrieves relevant observations from long, multi-modal web interaction histories by adapting to the current task. Unlike earlier methods that rely on static similarity thresholds or fixed-capacity buffers, it learns which past observations matter by building a relevance graph that adapts to the task, and it accounts for visual observations losing relevance faster than text. The method supports efficient retrieval and outperforms 19 baselines in retrieval quality on WebShop, VisualWebArena, and Mind2Web. The code is publicly available.

multi-modal retrieval, graph memory, policy-gradient optimization, temporal dynamics, task-adaptive relevance, nDCG, precision, web interaction history, modality-specific decay, retrieval systems
Authors
Saman Forouzandeh, Kamal Berahmand, Mahdi Jalili
Abstract
Retrieving relevant observations from long multi-modal web interaction histories is challenging because relevance depends on the evolving task state, modality (screenshots, HTML text, structured signals), and temporal distance. Prior approaches typically rely on static similarity thresholds or fixed-capacity buffers, which fail to adapt relevance to the current task context. We propose ACGM, a learned graph-memory retriever that constructs task-adaptive relevance graphs over agent histories using policy-gradient optimization from downstream task success. ACGM captures heterogeneous temporal dynamics with modality-specific decay (visual decays $4.3\times$ faster than text: $\lambda_v = 0.47$ vs. $\lambda_x = 0.11$) and learns sparse connectivity (3.2 edges/node), enabling efficient $O(\log T)$ retrieval. Across WebShop, VisualWebArena, and Mind2Web, ACGM improves retrieval quality to 82.7 nDCG@10 (+9.3 over GPT-4o, $p < 0.001$) and 89.2% Precision@10 (+7.7), outperforming 19 strong dense, re-ranking, multi-modal, and graph-based baselines. Code to reproduce our results is available at https://github.com/S-Forouzandeh/ACGM-Agentic-Web.
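To make the modality-specific decay concrete, the following is a minimal sketch and not the authors' implementation. The two decay rates are taken from the abstract; the exponential form $w(\Delta t) = e^{-\lambda \Delta t}$, the use of interaction steps as the time unit, and the names `DECAY_RATES`, `temporal_weight`, and `half_life` are assumptions made here for illustration.

```python
import math

# Decay rates reported in the abstract. Everything else in this sketch
# (exponential form, step-based time unit, function names) is an assumption.
DECAY_RATES = {"visual": 0.47, "text": 0.11}


def temporal_weight(modality: str, steps_ago: float) -> float:
    """Assumed exponential decay weight exp(-lambda * steps_ago), in (0, 1]."""
    if steps_ago < 0:
        raise ValueError("steps_ago must be non-negative")
    return math.exp(-DECAY_RATES[modality] * steps_ago)


def half_life(modality: str) -> float:
    """Number of steps after which the assumed weight falls to 0.5."""
    return math.log(2) / DECAY_RATES[modality]


# Ratio of the two rates; the abstract rounds this to 4.3x.
rate_ratio = DECAY_RATES["visual"] / DECAY_RATES["text"]

if __name__ == "__main__":
    print(f"rate ratio (visual / text): {rate_ratio:.2f}")
    for modality in DECAY_RATES:
        print(
            f"{modality}: half-life {half_life(modality):.2f} steps, "
            f"weight after 5 steps {temporal_weight(modality, 5):.3f}"
        )
```

Under this assumed form, a screenshot's weight halves in roughly one and a half steps while a text observation's takes a little over six, which is one way to read the claim that visual observations decay about 4.3 times faster than text.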