Lossy Compression for Sparse Aggregation
2026-06-29 • Information Theory
Information Theory
AI summaryⓘ
The authors study how multiple clients can send small, sparse updates to a central server that combines them into one big model, like sharing puzzle pieces to complete a picture. They focus on balancing how much data needs to be sent versus how accurately the server can build the global model. To do this, they create a method that squishes these updates using two techniques: covering and sketching. They also develop a strong mathematical limit showing the best possible tradeoff in a simple case, but for more complex cases, their method isn't perfect yet and more research is needed.
distributed learningsparse modelcommunication costmodel aggregationcompression schemecovering methodsketchingf-divergenceFano's inequalityfrequency estimation
Authors
Yijun Fan, Fangwei Ye, Raymond W. Yeung
Abstract
We consider the problem of transmitting sparse local updates to the server in a distributed learning system. Specifically, the system consists of $n$ clients, each possessing a $k$-sparse $d$-dimensional local model, and a central server responsible for aggregating the clients' models into a global model. The goal is to characterize the tradeoff between the communication cost in the transmission from the clients to the server and the accuracy in aggregating the global model. We propose a compression scheme for sparse local models by concatenating a covering method and a sketching method. We also present a converse based on f-divergence, which strengthens the conventional Fano-type lower bounds. The proposed lower bound is tight for the frequency estimation case, that is, each coordinate takes values in a binary alphabet. For general alphabets, the proposed achievable schemes remain suboptimal relative to the converse bounds, indicating that a complete characterization of the communication-accuracy tradeoff requires further investigation.