Reconstruction Limits for Repeated Differentially Private Aggregates: A Cramer-Rao Perspective on Query Geometry

2026-06-17 • Information Theory


AI summary

The authors study how reconstruction risk behaves when differentially private statistics about the same data are released repeatedly with Gaussian noise. They find that neither the number of releases nor the cumulative privacy budget alone determines how much an attacker can learn locally. What matters is the geometry of the release sequence: which directions become identifiable once nuisance variables are removed. Their examples show that extra releases help an attacker only when they expose new identifiable directions rather than repeated copies of the same information, and that the choice of privacy accountant (Basic Composition versus zCDP/RDP) sets how precisely those directions can be estimated. The result is a local, mechanism-specific benchmark for leakage under repeated private releases.

differential privacy, Gaussian noise, Fisher information, privacy accounting, Cramer-Rao bound, Basic Composition, zCDP, RDP, statistical queries, local reconstruction risk

Authors

Chenyue Zhang, Andrew Campbell, Anna Scaglione, Sean Peisert

Abstract

Repeated differentially private (DP) releases are often evaluated by transcript length or cumulative privacy accounting. We show that these quantities do not by themselves determine local reconstruction risk. For Gaussian-calibrated repeated statistical queries, the key object is the nuisance-profiled Fisher geometry of the release sequence: repetition helps only when new releases create identifiable directions after nuisance variables are removed. Thus, release geometry determines what can be locally identified, while the privacy accountant determines how precisely those directions can be estimated. We develop this principle in two settings. For labeled-target reconstruction with fixed-background IN/OUT averages, repeated copies collapse to a single target-versus-background contrast. The best linear unbiased estimator attains the Cramer-Rao bound, and additional copies provide only averaging gain; under Basic Composition this gain is dominated by the $\Theta(L^2 \log L)$ noise penalty, whereas zCDP/RDP-style Gaussian accounting makes the risk order-flat. For static permutation-invariant releases, labels remain unidentified, but feature diversity can make the sorted participating multiset locally identifiable. For polynomial moments and smooth thresholds, the useful number of releases is governed by the balance between newly exposed eigendirections and accountant-induced noise growth. These results provide a local, mechanism-specific benchmark for value leakage in repeated private sensing and analytics.
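
The averaging-versus-noise trade-off stated in the abstract can be illustrated with a minimal numerical sketch. It is not the authors' code: it assumes a scalar query with unit sensitivity repeated L times, an even split of a fixed total budget across releases, the classical Gaussian-mechanism calibration sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon (valid for per-release epsilon at most 1) for Basic Composition, and the standard fact that Gaussian noise of variance sigma^2 gives rho = Delta^2 / (2 sigma^2) under zCDP. The function names and default budgets are hypothetical. Averaging L independent noisy copies of the same value divides the noise variance by L, which for this Gaussian location model is also the Cramer-Rao bound.

```python
import math


def basic_composition_variance(L, eps_total=1.0, delta_total=1e-5, sensitivity=1.0):
    """Per-release Gaussian noise variance when (eps_total, delta_total) is split
    evenly over L releases (assumed classical calibration, per-release eps <= 1)."""
    eps_i = eps_total / L
    delta_i = delta_total / L
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta_i)) / eps_i
    return sigma ** 2  # grows like L^2 * log(L)


def zcdp_variance(L, rho_total=0.5, sensitivity=1.0):
    """Per-release Gaussian noise variance when a zCDP budget rho_total is split
    evenly over L releases (rho = sensitivity^2 / (2 * sigma^2) per release)."""
    rho_i = rho_total / L
    return sensitivity ** 2 / (2.0 * rho_i)  # grows like L


def averaged_variance(per_release_variance, L):
    """Variance of the mean of L independent noisy copies of the same value."""
    return per_release_variance / L


if __name__ == "__main__":
    for L in (1, 10, 100, 1000):
        bc = averaged_variance(basic_composition_variance(L), L)
        zc = averaged_variance(zcdp_variance(L), L)
        print(f"L={L:5d}  basic composition: {bc:12.2f}   zCDP: {zc:.4f}")
```

Under these assumptions the Basic Composition column grows roughly like L log L (the L^2 log L noise penalty divided by the averaging gain of L), while the zCDP column stays constant in L, matching the "order-flat" behavior described above. The constants depend on the assumed calibration and budgets and are not taken from the paper.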
