EviProp: Seeded Relevance Diffusion on Chunk-Page Graphs for Long Multimodal Document Retrieval

2026-06-08 • Information Retrieval

AI summary

The authors identify a weakness in current methods for finding relevant pages in long, visually rich documents: each page is scored separately against the query, so pages whose evidence sits in small details or depends on connections elsewhere in the document can be ranked too low. To address this, they propose EviProp, which represents the document as a graph of chunks and pages and spreads relevance from strongly matching nodes to related ones using Personalized PageRank. Experiments on MMLongBench-Doc and LongDocURL show that the method retrieves evidence pages more reliably than the baselines and improves question answering accuracy, with negligible online retrieval overhead. The code is publicly released.

document question answering, visual retrieval, Personalized PageRank, multimodal graph, evidence-page retrieval, MMLongBench-Doc, LongDocURL, query-page similarity, seeded relevance diffusion

Authors

Hongwei Zhang, Xiaoman Wang, Zehui Ling, Ruicheng Zhu, Yue Zhang, Pinlong Cai, Fuke Shen, Botian Shi, Tongquan Wei, Guohang Yan

Abstract

Retrieving evidence pages from visually rich long documents is a key challenge in document question answering. Existing page-level visual retrievers operate under an independent matching paradigm: each page is scored in isolation based on query-page similarity. This paradigm can under-rank evidence pages whose signals are localized in fine-grained chunks or depend on document-internal associations. We propose EviProp, a retrieval method that recovers such pages via seeded relevance diffusion. EviProp models each document as a multimodal Chunk-Page graph with hierarchical, sequential, and similarity links. Given a query, it combines dense visual page priors with sparse chunk seeds, then runs Personalized PageRank to diffuse relevance over the graph. Experiments on MMLongBench-Doc and LongDocURL show consistent gains in evidence-page retrieval over independent visual retrieval and text-visual fusion baselines. Downstream QA results further show that improved retrieval translates into better answer accuracy, with negligible online retrieval overhead. Our code is released at https://github.com/Flyecnu/EviProp.
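
The diffusion step described in the abstract can be illustrated with a minimal sketch. This is not the released EviProp implementation: the function name `personalized_pagerank`, the toy graph, the edge weights, the mixing weight `lam`, and the way page priors and chunk seeds are merged into one restart vector are all assumptions made for illustration. It only shows how Personalized PageRank could spread relevance from seeded chunks and pages over a graph with hierarchical, sequential, and similarity links.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.85, iters=100, tol=1e-10):
    """Power-iteration Personalized PageRank on an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples.
    seeds: dict mapping node -> non-negative restart mass (normalized here).
    alpha: probability of following an edge instead of restarting.
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    nodes = sorted(set(adj) | set(seeds))
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seeds must carry positive total mass")
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}

    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            if out == 0:
                # Isolated node: send its mass back to the restart distribution.
                for n in nodes:
                    nxt[n] += alpha * rank[u] * restart[n]
                continue
            for v, w in adj[u].items():
                nxt[v] += alpha * rank[u] * w / out
        delta = sum(abs(nxt[n] - rank[n]) for n in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank


# Hypothetical toy document: three pages (p*), one chunk (c*) per page.
edges = [
    ("p1", "c1", 1.0), ("p2", "c2", 1.0), ("p3", "c3", 1.0),  # hierarchical
    ("p1", "p2", 0.5), ("p2", "p3", 0.5),                     # sequential
    ("c1", "c3", 0.8),                                        # similarity
]

# Assumed inputs: dense page priors from a visual retriever and one sparse
# chunk seed from a text match. All values are made up.
page_prior = {"p1": 0.6, "p2": 0.3, "p3": 0.1}
chunk_seeds = {"c3": 1.0}
lam = 0.5  # hypothetical mixing weight between page priors and chunk seeds

seeds = {n: lam * s for n, s in page_prior.items()}
seeds.update({n: (1 - lam) * s for n, s in chunk_seeds.items()})

scores = personalized_pagerank(edges, seeds)
page_ranking = sorted(
    (n for n in scores if n.startswith("p")), key=scores.get, reverse=True
)
print(page_ranking)
```

Because the graph is built once per document and each query only changes the restart vector, a setup like this keeps per-query work limited to a few sparse iterations, which is in keeping with the abstract's claim of negligible online overhead.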
