Public Diffusion Models, Private Images: Key-Controlled Inversion for Conditional Reconstruction

2026-06-22 · Cryptography and Security

AI summary

The authors study how to protect input images processed by diffusion models when all model parameters are public, a setting in which anyone who obtains an intermediate latent representation can invert the process and recover the original image. They inject noise derived from a secret key into the inversion formula, so that only a user with the correct key can reconstruct the input; any other key yields unrecognizable output. They prove that the scheme is IND-CPA secure: the advantage of any probabilistic polynomial-time adversary is exponentially small in a tunable security parameter. Experiments across several models and datasets support these guarantees and show that the key noise does not amplify the performance drop caused by model discrepancies.

diffusion models, image inversion, white-box setting, key-controlled noise, error propagation, IND-CPA security, probabilistic polynomial-time adversary, generative models, open-source checkpoints
Authors
Lijunxian Zhang, Weihai Li, Bin Liu, Zikai Xu
Abstract
Diffusion models are often deployed in settings where model parameters are publicly accessible (e.g., open-source libraries or released checkpoints). This white-box scenario creates a serious security risk: any user who obtains an intermediate latent representation can invert the process to recover the original input image. Most prior work on access control for generative models assumes a black-box model (i.e., parameters are kept secret), typically under an honest-but-curious adversary. By contrast, we address the more challenging and realistic white-box setting where all parameters are public. We present a key-controlled inversion framework that turns the inherent error propagation of diffusion models, which exponentially amplifies small perturbations, into a security asset. By injecting key-dependent noise into the inversion formula, we ensure that only a user with the correct key can reconstruct the original image; any other key yields unrecognizable output. Theoretically, by leveraging existing error-propagation theory for diffusion models, we prove that the resulting ciphertext distribution is IND-CPA secure and show that the adversary's advantage is exponentially small in a tunable security parameter, hence negligible for any probabilistic polynomial-time (PPT) adversary. Experimentally, we validate these security guarantees across several models and datasets and further demonstrate cross-model robustness: the injected key noise does not amplify the performance drop caused by model discrepancies.
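
The idea of key-dependent noise combined with error amplification can be illustrated with a minimal toy sketch. This is not the authors' inversion formula and carries no security claim. It assumes a hypothetical scalar map `z = a*z + b*eps` in place of a real diffusion step, and a hypothetical SHA-256-seeded generator `key_noise` in place of a vetted key-derivation scheme. The names `encode` and `decode` and the parameters `steps`, `a`, and `b` are all illustrative assumptions.

```python
import hashlib
import random


def key_noise(key: bytes, step: int, size: int) -> list:
    """Derive deterministic Gaussian noise from (key, step).

    Assumption: a SHA-256-seeded Mersenne Twister stands in for a real
    cryptographic pseudorandom generator; it is NOT secure.
    """
    digest = hashlib.sha256(key + step.to_bytes(4, "big")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def encode(x, key: bytes, steps: int = 20, a: float = 0.9, b: float = 0.1) -> list:
    """Toy forward pass: contract the signal and add key noise at every step."""
    z = list(x)
    for t in range(steps):
        eps = key_noise(key, t, len(z))
        z = [a * zi + b * ei for zi, ei in zip(z, eps)]
    return z


def decode(z, key: bytes, steps: int = 20, a: float = 0.9, b: float = 0.1) -> list:
    """Toy reverse pass: undo each step in reverse order.

    Dividing by a < 1 amplifies any noise mismatch by 1/a per step,
    so an error introduced by a wrong key grows geometrically.
    """
    x = list(z)
    for t in reversed(range(steps)):
        eps = key_noise(key, t, len(x))
        x = [(xi - b * ei) / a for xi, ei in zip(x, eps)]
    return x


def max_abs_error(u, v) -> float:
    """Largest per-component absolute difference between two vectors."""
    return max(abs(ui - vi) for ui, vi in zip(u, v))


image = [0.1, 0.5, 0.9, 0.3]  # stand-in for pixel values
latent = encode(image, b"correct-key")

err_good = max_abs_error(image, decode(latent, b"correct-key"))
err_bad = max_abs_error(image, decode(latent, b"wrong-key"))
print("error with correct key:", err_good)
print("error with wrong key:  ", err_bad)
```

With the correct key the reverse pass cancels the injected noise up to floating-point rounding. With a different key, each mismatched noise term is scaled by a power of `1/a`, which mimics in miniature the exponential error amplification the abstract describes.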