Whole-Pool Setwise Reranking with Long-Context Language Models

2026-06-01

Information Retrieval
AI summary

The authors study how long-context large language models (LLMs) can make passage re-ranking faster and cheaper. Instead of reconstructing a ranking from many dependent calls over small groups of passages, they show the model all unranked candidates at once and introduce DualEnd, which picks the most and least relevant passages in the same call. Filling the ranking from both ends lets DualEnd rank 100 candidates in 50 serial calls rather than 99. Experiments with nine open-weight LLMs on two benchmarks indicate that long context can make LLM re-rankers both effective and efficient.

large language models, passage re-ranking, long-context models, DualEnd, whole-pool re-ranking, model calls, efficiency, ranking algorithms
Authors
Hang Li, Chuting Yu, Teerapong Leelanupab, Bevan Koopman, Guido Zuccon
Abstract
Previous LLM-based passage re-rankers are often expensive and slow because input context constraints require the LLM to make many dependent model calls. We study how recent long-context LLMs change this problem: when the full set of retrieved candidate passages can be shown to the model at once, ranking no longer has to be reconstructed from many overlapping local comparisons. We propose Whole-Pool Setwise re-ranking, where each call considers all currently unranked candidate passages, and introduce DualEnd, which identifies both the most and least relevant passages in one call. By filling the ranking from both ends, DualEnd ranks 100 candidates with 50 serial LLM calls, compared with 99 calls for comparable one-passage-at-a-time whole-pool methods. Experiments with nine open-weight LLMs on two passage re-ranking benchmarks, measuring effectiveness, call count, token use, runtime, and output reliability, show that long context is not merely more prompt space, but an opportunity to make LLM re-rankers both effective and efficient.
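
The call counts in the abstract can be checked with a small sketch of the two loops. This is an illustration under stated assumptions, not the authors' implementation: the function names `dual_end_rank` and `top_only_rank` are hypothetical, candidates are assumed to be unique passage identifiers, and the LLM call is replaced by a caller-supplied selector function that returns the chosen passage(s) from the current pool.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins for one LLM call over the whole unranked pool.
DualSelector = Callable[[Sequence[str]], Tuple[str, str]]  # -> (most, least relevant)
TopSelector = Callable[[Sequence[str]], str]               # -> most relevant


def dual_end_rank(candidates: Sequence[str],
                  pick_best_and_worst: DualSelector) -> Tuple[List[str], int]:
    """Fill the ranking from both ends; returns (ranking, number of calls)."""
    pool = list(candidates)          # assumed to contain unique identifiers
    head: List[str] = []             # grows from the top of the ranking
    tail: List[str] = []             # grows from the bottom (stored worst-first)
    calls = 0
    while len(pool) > 1:
        best, worst = pick_best_and_worst(pool)
        calls += 1
        # Guard against an unreliable selector output.
        if best == worst or best not in pool or worst not in pool:
            raise ValueError("selector returned an invalid pair")
        head.append(best)
        tail.append(worst)
        pool.remove(best)
        pool.remove(worst)
    # At most one passage is left; it sits between the two ends.
    return head + pool + tail[::-1], calls


def top_only_rank(candidates: Sequence[str],
                  pick_best: TopSelector) -> Tuple[List[str], int]:
    """One-passage-at-a-time whole-pool baseline; returns (ranking, calls)."""
    pool = list(candidates)
    ranking: List[str] = []
    calls = 0
    while len(pool) > 1:
        best = pick_best(pool)
        calls += 1
        ranking.append(best)
        pool.remove(best)            # raises ValueError if best is not in pool
    return ranking + pool, calls
```

With an oracle selector over 100 candidates, the first loop makes 50 calls and the second makes 99, matching the figures in the abstract. In this sketch, an odd-sized pool leaves one passage over after the last call, and that passage is placed in the middle without a further call.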