Lossy Compression for Sparse Aggregation

2026-06-29 • Information Theory

Information Theory

AI summaryⓘ

The authors study how multiple clients can send small, sparse updates to a central server that combines them into one big model, like sharing puzzle pieces to complete a picture. They focus on balancing how much data needs to be sent versus how accurately the server can build the global model. To do this, they create a method that squishes these updates using two techniques: covering and sketching. They also develop a strong mathematical limit showing the best possible tradeoff in a simple case, but for more complex cases, their method isn't perfect yet and more research is needed.

distributed learningsparse modelcommunication costmodel aggregationcompression schemecovering methodsketchingf-divergenceFano's inequalityfrequency estimation

Authors

Yijun Fan, Fangwei Ye, Raymond W. Yeung

Abstract

We consider the problem of transmitting sparse local updates to the server in a distributed learning system. Specifically, the system consists of $n$ clients, each possessing a $k$-sparse $d$-dimensional local model, and a central server responsible for aggregating the clients' models into a global model. The goal is to characterize the tradeoff between the communication cost in the transmission from the clients to the server and the accuracy in aggregating the global model. We propose a compression scheme for sparse local models by concatenating a covering method and a sketching method. We also present a converse based on f-divergence, which strengthens the conventional Fano-type lower bounds. The proposed lower bound is tight for the frequency estimation case, that is, each coordinate takes values in a binary alphabet. For general alphabets, the proposed achievable schemes remain suboptimal relative to the converse bounds, indicating that a complete characterization of the communication-accuracy tradeoff requires further investigation.

View PDFOpen arXiv