Engineering Scalable Distributed List Ranking

2026-06-08 · Distributed, Parallel, and Cluster Computing

Distributed, Parallel, and Cluster Computing; Data Structures and Algorithms
AI summary

The authors revisit list ranking, a classic problem in parallel computing that has seen little progress in the last 20 years. They improve Sibeyn's sparse ruling-set algorithm so that it scales to much larger inputs and many more processors, and they analyze in more detail how the algorithm's parameters affect performance. An extensive experimental study shows that indirect communication, exploiting input locality, and message coalescing let the algorithm scale to billions of elements on up to 24,576 cores.

list ranking, parallel computing, algorithm engineering, ruling-set algorithm, performance scaling, input locality, message coalescing, experimental study, distributed memory, processor cores
Authors
Peter Sanders, Matthias Schimek, Tim Niklas Uhl, Thomas Weidmann
Abstract
The list ranking problem is one of the classical problems of parallel computing, with nontrivial algorithms and many applications as a subroutine for solving other problems. While it was studied intensively in the early days of parallel computing, little has happened in the last 20 years. In particular, there is little work on scaling list ranking to large machines and input sizes. We reconsider list ranking starting from the ground-breaking results of Sibeyn a quarter century ago. We employ algorithm and performance engineering to improve his sparse ruling-set algorithm, making it capable of scaling to many processors, and provide a more detailed analysis of the impact of the algorithm's parameters, further guiding our practical implementation. We perform an extensive experimental study across a variety of input instances with different structural properties. We demonstrate that indirect communication, exploiting input locality, and message coalescing allow scaling to billions of elements on up to 24,576 cores.
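
For readers unfamiliar with the problem, the following is a minimal single-machine sketch of what list ranking computes: for every element of a linked list, its distance to the end of the list. It is not the sparse ruling-set algorithm discussed in the abstract. It only contrasts a sequential walk with the textbook pointer-jumping technique, simulated here in synchronous rounds. The function names and the convention that the last element points to itself are assumptions made for illustration.

```python
def rank_sequential(succ):
    """Sequential baseline. succ[i] is the successor of element i;
    the last element of a list points to itself (assumed convention)."""
    n = len(succ)
    has_pred = [False] * n
    for i, s in enumerate(succ):
        if s != i:
            has_pred[s] = True

    rank = [0] * n
    for head in range(n):
        if has_pred[head]:
            continue  # not the first element of a list
        # Walk the list once and record the visiting order.
        order = [head]
        while succ[order[-1]] != order[-1]:
            order.append(succ[order[-1]])
        # The last element has rank 0, its predecessor rank 1, and so on.
        for dist, elem in enumerate(reversed(order)):
            rank[elem] = dist
    return rank


def rank_pointer_jumping(succ):
    """Pointer jumping, simulated round by round.
    Invariant: rank[i] is the distance from i to nxt[i]."""
    n = len(succ)
    nxt = list(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    # Stop once every pointer has reached the end of its list.
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # Both updates read only the values from the previous round.
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
    return rank


# Example list: 1 -> 0 -> 3 -> 4 -> 2, where element 2 is the end.
example = [3, 0, 2, 4, 2]
print(rank_sequential(example))
print(rank_pointer_jumping(example))
```

Pointer jumping needs only a logarithmic number of rounds in the list length, but every element does work in every round. That extra work is one reason more work-efficient approaches such as ruling sets are of interest on large machines.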