Proofs of Ownership for Machine Learning Models

2026-06-29 • Machine Learning

Machine Learning • Cryptography and Security

AI summary

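The authors study when a model creator can prove that a stolen, slightly modified machine learning model originated with them. They formalize the problem as a game among three parties: an owner, who releases a slightly perturbed model together with a proof of ownership; a thief, who modifies the released model to evade detection while keeping it useful; and a judge, who decides whether a given model derives from the owner's model or was developed independently. Their main result, for classifiers in the black-box setting, is that under standard cryptographic assumptions ownership can be proven if and only if the concept class is not self-correctable, in a sense close to that of Blum, Luby and Rubinfeld. Roughly, a class is self-correctable when a model that errs on a small fraction of inputs can be turned, using only queries to it, into one that is correct on every input with high probability. The result is constructive and extends, with some variations, to several related settings.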

Proof of Ownership, Machine Learning Model, Model Theft, Black-box Setting, Cryptographic Assumptions, Concept Class, Self-correctability, Blum Luby Rubinfeld, Model Verification, Model Perturbation

Authors

Ran Canetti, Shafi Goldwasser, Or Zamir

Abstract

With the increasing adoption of Machine Learning, protecting model ownership has become an essential challenge. We initiate a formal study of Proof of Ownership for machine learning models: under what conditions can one prove that a stolen model originated from a particular creator? We model proofs of ownership as a game among three parties: a model owner, a thief, and a judge. The owner transforms the original model into a slightly perturbed model together with a proof of ownership. The thief then obtains the transformed model and attempts to minimally modify it so that it remains useful but escapes detection as owned by the model owner. Finally, the judge receives a model and a proof of ownership, and must decide whether the given model is a modified version of some model created by the model owner, or else the given model was developed independently. Our main result is a dichotomy for classifiers in the black-box setting: Under standard cryptographic assumptions, ownership of models for some concept class can be proven in the above sense if and only if the concept class is not self-correctable, in a sense close to that of Blum, Luby and Rubinfeld, STOC'90. The result is constructive and extends, with some variations, to a number of related settings.
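To make the notion of self-correctability concrete, the sketch below shows the classic Blum-Luby-Rubinfeld self-corrector for linear functions over GF(2). It is only an illustration of the textbook notion, not the paper's construction: the abstract says its definition is merely "close to" this one, and the function names (`make_linear`, `corrupt`, `self_correct`) and the 5% corruption rate are assumptions chosen for the example. One plausible reading of the dichotomy is that, for such a class, a party with black-box access to a perturbed model could query it to undo the perturbation.

```python
import random


def make_linear(a):
    """Return the GF(2)-linear function x -> <a, x> mod 2 on bit tuples."""
    def g(x):
        return sum(ai & xi for ai, xi in zip(a, x)) % 2
    return g


def corrupt(g, n, fraction, rng):
    """Flip g on a random `fraction` of the 2**n inputs.

    This is a hypothetical stand-in for a small perturbation of a model.
    Returns the corrupted function and the set of flipped inputs.
    """
    all_inputs = [tuple((i >> k) & 1 for k in range(n)) for i in range(2 ** n)]
    flipped = set(rng.sample(all_inputs, int(fraction * len(all_inputs))))

    def f(x):
        return g(x) ^ int(x in flipped)

    return f, flipped


def self_correct(f, x, n, trials, rng):
    """Recover g(x) from black-box queries to the corrupted f.

    For linear g, g(x) = g(x + r) + g(r) for every r. Both x + r and r are
    uniformly distributed, so each vote is correct unless one of the two
    queries lands on a corrupted point; a majority vote amplifies this.
    """
    votes = 0
    for _ in range(trials):
        r = tuple(rng.randrange(2) for _ in range(n))
        x_plus_r = tuple(xi ^ ri for xi, ri in zip(x, r))
        votes += f(x_plus_r) ^ f(r)
    return int(2 * votes > trials)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    g = make_linear((1, 0, 1, 1, 0, 0, 1, 0, 1, 1))
    f, flipped = corrupt(g, n, 0.05, rng)
    x = next(iter(flipped))  # a point where f disagrees with g
    print("g(x) =", g(x), " f(x) =", f(x),
          " corrected =", self_correct(f, x, n, 101, rng))
```

With a corruption rate of 5%, each individual vote is wrong with probability at most 10%, so a majority over about a hundred votes is wrong only with negligible probability. A class lacking any such query-based repair procedure is what the abstract calls not self-correctable.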
