Engineering Practical Succinct Bit Vectors: A Space-Time Pareto Analysis on Apple Silicon ARM64 Cores

2026-05-25Data Structures and Algorithms

Data Structures and Algorithms
AI summary

The authors studied different ways to quickly answer rank and select queries on compressed bit vectors, which store data using very little space. They tested three methods on Apple Silicon chips and found one method improved speed by 1.4 times using uneven block sizes. They also noticed that one compressed method (RRRBitVec) is much faster with certain data patterns and significantly speeds up select queries by using smart indexing strategies. Their implementations were thoroughly tested for correctness with no errors, and the code is publicly available.

succinct data structuresbit vectorsrank queriesselect queriesRRR codingApple Siliconcompressionsuperblockindexingcache optimization
Authors
Ishant Garg
Abstract
Succinct data structures use space close to the information-theoretic minimum while answering queries directly on the compressed representation. In this paper, we present a practical engineering study of rank and select queries on bit vectors. We evaluate a classic two-level block baseline (BlockBitVec), an asymmetric superblock implementation (FastBitVec), and an entropy-compressed representation (RRRBitVec) based on the Raman, Raman, and Rao (RRR) coding scheme. On Apple Silicon (M-series ARM architecture), we demonstrate a 1.4x speedup in rank queries through asymmetric 4096/256-bit block boundaries, with a rank index overhead of 7.8%. We investigate the empirical behavior of RRRBitVec and observe a symmetric density-dependent bell-curve for rank latency -- where queries at extreme densities (1% and 99%) run up to 39% faster due to offset elimination at boundary classes. We further show that RRRBitVec achieves a 4.9x speedup over classic binary-search select baselines, running in 33.7 ns at uniform density by using a superblock-level sampling index that restricts sequential scans to L1-cache lookups. All implementations are validated against a correctness fuzzer executing over 78 million assertions with no failures. Source code and test harnesses are publicly available.