Masking or Mitigating? Deconstructing the Impact of Query Rewriting on Retriever Biases in RAG

2026-04-07

Information Retrieval
AI summary

The authors study how different query enhancement techniques affect biases in the dense retrievers used in retrieval-augmented generation. They find that simple LLM-based query rewriting reduces bias the most in aggregate, but fails when multiple biases occur together. Mechanistically, some methods reduce bias by increasing the variance of retrieval scores, while others decorrelate scores from the features that induce bias. However, no single method fixes all biases across every retriever, so the right approach depends on the specific bias problem. The authors also distinguish query-document interaction biases from document encoding biases, clarifying the limits of debiasing RAG systems through query-side changes alone.

Dense retrievers · Retrieval-augmented generation · Query rewriting · Systematic bias · Large language models · Score variance · Pseudo-document generation · Query-document interaction bias · Document encoding bias
Authors
Agam Goyal, Koyel Mukherjee, Apoorv Saxena, Anirudh Phukan, Eshwar Chandrasekharan, Hari Sundaram
Abstract
Dense retrievers in retrieval-augmented generation (RAG) systems exhibit systematic biases -- including brevity, position, literal matching, and repetition biases -- that can compromise retrieval quality. Query rewriting techniques are now standard in RAG pipelines, yet their impact on these biases remains unexplored. We present the first systematic study of how query enhancement techniques affect dense retrieval biases, evaluating five methods across six retrievers. Our findings reveal that simple LLM-based rewriting achieves the strongest aggregate bias reduction (54%), yet fails under adversarial conditions where multiple biases combine. Mechanistic analysis uncovers two distinct mechanisms: simple rewriting reduces bias through increased score variance, while pseudo-document generation methods achieve reduction through genuine decorrelation from bias-inducing features. However, no technique uniformly addresses all biases, and effects vary substantially across retrievers. Our results provide practical guidance for selecting query enhancement strategies based on specific bias vulnerabilities. More broadly, we establish a taxonomy distinguishing query-document interaction biases from document encoding biases, clarifying the limits of query-side interventions for debiasing RAG systems.