AbstRAG: Learning to Abstract for Retrieval Problems

2026-06-08 • Computation and Language


AI summary

The authors address a retrieval failure in which a query and the document that answers it describe the same thing at different levels of abstraction, a mismatch they call the abstraction gap. Their method, AbstRAG, decomposes this gap into four components (expression, conceptual, intent-evidence, and event-type) and closes it by iteratively refining how queries are matched to evidence. A critic diagnoses each retrieval failure and proposes a minimal patch, which is accepted only if it passes sufficiency and compression checks. Across three within-document retrieval benchmarks, AbstRAG outperforms seven baselines on nDCG@10 in 18 of 21 comparisons and raises generation accuracy by 1.9%, 5.2%, and 4.0%. Ablations attribute most of the retrieval gain to the refinement step, and the compression control keeps the system from over-expanding the query.

retrieval-augmented generation, abstraction gap, reflective refinement, query intent, evidence matching, nDCG@10, utility prior, ablation study, compression control, paired-bootstrap test

Authors

Lei Xu, Xin Quan, Daniel Pedronette, André Freitas

Abstract

Retrieval-augmented generation often fails when the query, the document evidence, and the user's intent are expressed at different levels of abstraction. A query may ask about a class, a relation, or an event, while the document only states specific instances, indirect framings, or scoped formulations. We define this mismatch as an abstraction gap: the minimal set of typed assumptions required to align query intent with the available evidence. To close this gap, we introduce AbstRAG, which treats abstraction as an explicit retrieval object. AbstRAG decomposes the query-evidence gap into expression, conceptual, intent-evidence, and event-type components, and scores relevance by combining match quality, a query-independent utility prior, and the cost of the required bridges. Its central mechanism is reflective refinement: a critic diagnoses retrieval failures, localizes the failed abstraction operator, proposes a minimal stage-specific patch, and accepts the patch only under sufficiency and compression controls. Across three within-document retrieval benchmarks against seven baselines, AbstRAG outperforms on nDCG@10 in 18 of 21 paired-bootstrap contrasts and improves generation accuracy by 1.9%, 5.2%, and 4.0% across the three benchmarks; ablations confirm that reflective refinement drives most of the retrieval gain and the compression control alone reduces over-expansion false positives from 73.7% to 0% on a stress slice.
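
For readers unfamiliar with the evaluation terms in the abstract, the following is a minimal Python sketch of nDCG@10 and of a paired-bootstrap contrast between two systems. It is not the authors' evaluation code. The function names, the linear-gain form of DCG, the assumption that each ranking covers the whole judged candidate pool, and the resampling settings (`n_resamples`, `seed`) are all illustrative assumptions; the abstract does not specify the exact protocol.

```python
import math
import random


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the top-k graded relevance labels.

    Uses the linear-gain form g / log2(rank + 1) with 1-based ranks
    (an assumption; some toolkits use 2**g - 1 as the gain instead).
    """
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, k=10):
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ordering.

    Assumes ranked_gains lists the relevance label of every judged
    candidate, in the order the system ranked them.
    """
    ideal = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0


def paired_bootstrap(scores_a, scores_b, n_resamples=1000, seed=0):
    """Fraction of query-set resamples in which system A beats system B.

    scores_a and scores_b hold per-query metric values (for example
    nDCG@10) for the same queries in the same order. Queries are
    resampled with replacement, keeping each A/B pair together.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired per query")
    rng = random.Random(seed)
    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    wins = 0
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) > 0:  # mean difference > 0 on this resample
            wins += 1
    return wins / n_resamples
```

In this sketch, a contrast would count as a win for system A when the returned fraction exceeds a chosen threshold such as 0.95; that threshold is likewise an assumption, not a detail taken from the paper.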
