ANN Search: Recall What Matters
2026-06-03 • Information Retrieval
Information Retrieval, Artificial Intelligence, Databases, Machine Learning
AI summary
The authors explain that current ways to check if approximate nearest neighbor (ANN) search methods work well rely on measuring how many of the exact neighbors these methods find (Recall@k). However, they argue that it’s more important to measure how close the found neighbors really are, not just if they match exactly. They propose using a new measure called 1/Ratio@k, which compares distances instead of exact matches, and show it better reflects real usefulness in tasks like classification and text generation while needing less computation. Their findings suggest this new measure is a more practical and accurate way to judge ANN performance.
Approximate nearest neighbor (ANN), Recall@k, 1/Ratio@k, Nearest neighbor search, Distance metrics, Information retrieval, Classification, Retrieval-augmented generation, Computational efficiency
Authors
Dimitris Dimitropoulos, Nikos Mamoulis
Abstract
Approximate nearest neighbor (ANN) search has become a core primitive in information retrieval and modern machine learning tasks, from classification to retrieval-augmented generation. The community evaluates and tunes ANN algorithms primarily on their throughput at a given Recall@k, the fraction of true exact neighbors retrieved. We argue that what really matters in ANN search is the quality of the retrieved results and not their overlap with the true kNN set. We show that using Recall@k to assess retrieval quality forces unnecessary computational overhead and investigate replacing it by 1/Ratio@k, the inverse approximation ratio. 1/Ratio@k evaluates the differences between the distances of the retrieved and true neighbors. It is judge-free, hyperparameter-free, and computable from standard ANN benchmark inputs alone. We benchmark state-of-the-art ANN algorithms across diverse datasets spanning a wide range of intrinsic dimensionalities, evaluating the two metrics comprehensively across efficiency, downstream classification, and retrieval-augmented generation. On the efficiency axis, optimizing for 1/Ratio@k reaches operational quality thresholds at a substantially lower computational cost than Recall@k. In downstream tasks, performance indicators (label precision, semantic similarity, BERTScore, and LLM-graded quality) remain highly stable even when Recall@k drops significantly. The inverse approximation ratio, on the other hand, closely mirrors this stability, tracking true utility much better than Recall@k. Ultimately, while Recall@k overstates the true cost of approximation, 1/Ratio@k offers a more accurate, deployable proxy for actual ANN quality.
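To make the contrast between the two metrics concrete, here is a minimal Python sketch for a single query. The abstract does not state the exact formula for 1/Ratio@k, so the code assumes the form of approximation ratio commonly used in ANN benchmarks: the mean, over ranks 1..k, of the retrieved distance divided by the true distance at the same rank, then inverted. The function names, the `eps` guard against zero distances, and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def recall_at_k(retrieved_ids, true_ids):
    """Fraction of the true k nearest neighbors that appear in the retrieved set."""
    k = len(true_ids)
    return len(set(retrieved_ids) & set(true_ids)) / k

def inverse_ratio_at_k(retrieved_dists, true_dists, eps=1e-12):
    """Inverse of the mean rank-wise distance ratio (assumed definition).

    Both inputs hold k distances from the query. They are sorted so that the
    i-th retrieved distance is compared with the i-th true distance.
    eps is a hypothetical guard for zero distances (e.g. duplicate points).
    """
    r = np.sort(np.asarray(retrieved_dists, dtype=float))
    t = np.sort(np.asarray(true_dists, dtype=float))
    ratio = np.mean((r + eps) / (t + eps))  # >= 1 when t is the exact kNN
    return 1.0 / ratio                      # 1.0 means distances match exactly

# Toy query with k = 4: the ANN index misses the true 4th neighbor (id 3)
# and returns id 7 instead, which is only slightly farther away.
true_ids, true_dists = [0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0]
retrieved_ids, retrieved_dists = [0, 1, 2, 7], [1.0, 2.0, 3.0, 4.1]

recall = recall_at_k(retrieved_ids, true_ids)                 # 0.75
inv_ratio = inverse_ratio_at_k(retrieved_dists, true_dists)   # about 0.994
```

In this toy case Recall@k falls to 0.75 because one identifier differs, while the assumed 1/Ratio@k stays near 1 because the substitute neighbor is almost as close as the one it replaced. This is the kind of gap between set overlap and distance quality that the abstract describes.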