CLIP: Lightweight Cosine-Law-Based Inverted-List Pruning for IVF-Based Vector Search
2026-06-29 • Databases
AI summary
The authors address the problem of slow search in vector-based retrieval systems caused by checking too many groups and items during queries. They propose CLIP, a new pruning method that quickly skips irrelevant groups and items using a simple cosine-based rule, improving speed without much extra cost. They create enhanced versions of existing search structures (IVF-CLIP and HIVF-CLIP) and a new system (LSM-IVF) that handles updates efficiently while keeping queries fast. Their experiments show that these methods greatly reduce unnecessary checks and improve search speeds compared to previous approaches.
vector search, inverted file (IVF), cosine similarity, pruning, multimodal retrieval, IVFFlat, hierarchical indexing, update efficiency, LSM tree, query latency
Authors
Yitong Song, Shuhang Lu, Xuanhe Zhou, Pengcheng Zhang, Jianliang Xu
Abstract
Vector search has become a core component of modern multimodal retrieval systems. Among existing methods, inverted file (IVF)-based methods are widely adopted due to their scalability, efficient updates, and hardware friendliness. However, they are fundamentally limited by coarse-grained execution: each query typically probes many clusters and exhaustively scans all vectors within them, resulting in high query latency. Prior work mitigates this with pruning strategies, but these often incur substantial pruning overhead, lack cluster-level pruning, and compromise update efficiency due to heavy maintenance of pruning metadata. This paper proposes CLIP, a lightweight cosine-law-based pruning technique that supports both inter- and intra-cluster pruning, substantially reducing unnecessary cluster and vector accesses with negligible overhead. First, CLIP exploits the monotonicity of cosine-law-based lower bounds, which allows an unpromising cluster to be eliminated in O(1) time and batches of irrelevant vectors to be filtered in time logarithmic in the list size, with a tight analytical guarantee. Second, building on this, we develop two IVF variants: IVF-CLIP, which integrates CLIP into IVFFlat, and HIVF-CLIP, which extends it with a hierarchical structure for adaptive sub-cluster probing. Third, for dynamic workloads, we present LSM-IVF, an LSM-inspired design that supports fast updates by deferring index maintenance to background compaction, and enables efficient queries via CLIP-based optimizations that eliminate costly level-by-level searches. Extensive experiments show that CLIP variants achieve up to 78% pruning and 69% higher efficiency over static IVF baselines, while LSM-IVF improves throughput by up to 141% over dynamic IVF baselines with comparable update efficiency.
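To make the pruning idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the simplest bound that the law of cosines yields: for a query q, a centroid c, and a vector x with d = ||q - c|| and r = ||x - c||, we have ||q - x||^2 = d^2 + r^2 - 2dr cos(theta) >= (d - r)^2. If each inverted list is kept sorted by r, this bound is monotone on either side of d, so a whole cluster can be rejected with one comparison and the surviving range found by binary search. The sorted layout, the threshold tau, and the names build_list and scan_cluster are all assumptions made for illustration; the exact bound used by CLIP is not stated in the abstract.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_list(centroid, vectors):
    """Hypothetical list layout: vectors sorted by distance r to the centroid."""
    entries = sorted((dist(v, centroid), v) for v in vectors)
    radii = [r for r, _ in entries]
    vecs = [v for _, v in entries]
    return radii, vecs


def scan_cluster(query, centroid, radii, vecs, tau):
    """Return (hits within tau, number of exact distance computations).

    Uses the assumed lower bound |d - r| <= ||q - x||: any vector whose
    bound exceeds tau cannot be a hit and is never touched.
    """
    if not radii:
        return [], 0
    d = dist(query, centroid)
    # Cluster-level check in O(1): the bound is smallest at the nearest
    # end of the sorted radii, so two comparisons rule out the whole list.
    if d - radii[-1] > tau or radii[0] - d > tau:
        return [], 0
    # Vector-level check in O(log n): keep only r in [d - tau, d + tau].
    lo = bisect.bisect_left(radii, d - tau)
    hi = bisect.bisect_right(radii, d + tau)
    hits = []
    for v in vecs[lo:hi]:
        dv = dist(query, v)
        if dv <= tau:
            hits.append((dv, v))
    return sorted(hits), hi - lo


# Toy usage: one cluster centered at the origin, five 2-D vectors.
radii, vecs = build_list(
    (0.0, 0.0),
    [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (0.0, 5.0), (6.0, 0.0)],
)
hits, n_exact = scan_cluster((3.5, 0.0), (0.0, 0.0), radii, vecs, tau=1.0)
```

In this toy run only the vector at distance 3 from the centroid falls inside the retained range, so a single exact distance is computed instead of five. In a top-k search, tau would typically be the current k-th best distance, which tightens as the scan proceeds; that coupling is likewise an assumption here, not something the abstract describes.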