C^2: Cache-Conscious Succinct Tries with Adaptive Unary Path Compression

2026-06-15Databases

DatabasesData Structures and Algorithms
AI summary

The authors improved special data structures called succinct tries, which store strings using little memory and allow fast searching. They tackled two main problems: slow memory access during searches and wasted space when the data has many single-child paths. Their methods, called C1 and C2, make tries more memory-friendly and compress repetitive paths. Testing on different datasets showed their improvements speed up queries and reduce memory use compared to original versions and other methods.

succinct triesstring dictionariescache missesunary pathsdata compressionmemory footprintquery performanceFSTCoCo-trieMarisa
Authors
Kepan Zhang, Tiancheng Zhao, Helen Xu
Abstract
Succinct tries are powerful string dictionaries because of their low memory footprint and fast query performance. However, existing succinct trie implementations face two key challenges to spatial locality: 1) they incur unnecessary cache misses during queries, especially during trie navigation operations, and 2) they waste significant space when the data contains many unary paths. We propose C^2, a set of two techniques: C_1 introduces a more cache-friendly layout for the \bv underlying succinct tries, and C_2 compresses redundant unary paths. We thoroughly redesign three state-of-the-art succinct tries: FST, CoCo-trie, and Marisa, producing C^2-FST, C^2-CoCo, and C^2-Marisa. Experiments on six diverse datasets show that the C_1 optimization improves query performance by 1.58x, 1.12x, and 1.42x, respectively, compared to the original FST, CoCo-trie, and Marisa. Furthermore, the C_2 optimization achieves a 1.3x smaller memory footprint on average. The succinct tries optimized with both aspects of C^2 achieve better space-time tradeoffs than their original versions and other state-of-the-art succinct tries, while using significantly less space than non-succinct tries like ART and C-ART.