A Watermark for Vision-Language-Action and World Action Models
2026-06-22 • Cryptography and Security • Robotics
AI summary
The authors developed a way to covertly fingerprint robot control models that turn camera input into actions, so an owner can prove that a deployed model is theirs without changing how the robot behaves. The method replaces the random noise the model draws before generating actions with keyed noise that follows the same distribution but serves as a hidden signature. Later, by observing the actions the robot executes, the owner recovers the noise seed, tests it against the secret key, and confirms whether the model is theirs, even if someone post-processes the outputs or edits the weights. Experiments show that the fingerprint is detected reliably with little change to task performance and that it resists common removal attempts.
Vision-language-action models, World-action models, Fingerprinting, Gaussian noise seed, Robot control, Maximum a posteriori optimization, Model verification, Black-box service, Watermarking, Adversarial attacks
Authors
Yule Liu, Shuai Liu, Jiaheng Wei, Xinlei He
Abstract
Vision-language-action (VLA) models and world-action models (WAMs) are the generative models now driving general-purpose robot control, turning raw camera input directly into motor commands. They are increasingly deployed as black-box services, where a partner runs the policy through an interface while the owner keeps the weights private. Training such a model requires proprietary data and substantial compute, which makes the deployed model itself valuable intellectual property. To protect it, we propose keyed latent-provenance verification, a method that fingerprints the policy through the seed of the Gaussian noise vector that the model draws before generation. At the injection stage, the owner swaps this seed for a keyed one with the same distribution as ordinary noise, so the fingerprinted actions are statistically identical to those of an ordinary run and an adversary watching the output finds no signal to detect or remove. At the verification stage, the owner runs the suspect model under authorized access and records the action channels the robot executes, which give a partial and possibly post-processed view of the policy's output. From this view, the verifier recovers the seed by gradient-based maximum a posteriori (MAP) optimization, tests it for the secret key to score each rollout, and aggregates these scores into a single decision on whether the suspect model belongs to the owner. We evaluate the method on two representative models across two robot suites. The experiments cover detection of the fingerprint, identification of which of several keys a suspect carries, robustness to a range of attacks, and an analysis of why the design works. Across both models, the fingerprint can be detected reliably with little change to task performance, and it remains detectable under output-side removal attacks and weight-level edits.
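The inject, recover, score, and aggregate steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the linear map `A` standing in for the policy, the dimensions and noise level, the SHA-256 derivation in `keyed_noise`, the cosine-based `rollout_score`, and the decision threshold. The real models are nonlinear generative policies, and the abstract does not specify the test statistic.

```python
import hashlib
import numpy as np

# Assumed toy sizes: latent dimension, observed action channels, observation noise.
D, M, SIGMA = 64, 32, 0.5

def keyed_noise(key: bytes, rollout: int) -> np.ndarray:
    """Standard-normal vector derived deterministically from (key, rollout index)."""
    digest = hashlib.sha256(key + rollout.to_bytes(8, "big")).digest()
    rng = np.random.default_rng(int.from_bytes(digest, "big"))
    return rng.standard_normal(D)

def observe(A, z, rng):
    """Toy policy: a partial, noisy linear view of the latent (stand-in for executed actions)."""
    return A @ z + SIGMA * rng.standard_normal(M)

def map_recover(A, a, steps=300):
    """Gradient descent on ||A z - a||^2 / (2 sigma^2) + ||z||^2 / 2 (Gaussian prior on z)."""
    z = np.zeros(D)
    # Step size 1/L, where L bounds the largest eigenvalue of the Hessian.
    lr = 1.0 / (np.linalg.norm(A, 2) ** 2 / SIGMA**2 + 1.0)
    for _ in range(steps):
        grad = A.T @ (A @ z - a) / SIGMA**2 + z
        z -= lr * grad
    return z

def rollout_score(z_hat, z_key):
    """Cosine similarity scaled by sqrt(D); roughly N(0, 1) when the key is unrelated."""
    cos = (z_hat @ z_key) / (np.linalg.norm(z_hat) * np.linalg.norm(z_key))
    return np.sqrt(D) * cos

def verify(A, key, observations, threshold=4.0):
    """Score each rollout against the keyed noise, then aggregate into one decision."""
    scores = [rollout_score(map_recover(A, a), keyed_noise(key, i))
              for i, a in enumerate(observations)]
    stat = float(np.sum(scores) / np.sqrt(len(scores)))
    return stat, stat > threshold

rng = np.random.default_rng(0)
A = rng.standard_normal((M, D)) / np.sqrt(D)   # hypothetical stand-in for the policy
key = b"owner-secret"                          # hypothetical secret key
n_rollouts = 8

# Fingerprinted model: the latent of rollout i is the keyed noise.
marked = [observe(A, keyed_noise(key, i), rng) for i in range(n_rollouts)]
# Unrelated model: the latent is ordinary Gaussian noise.
unmarked = [observe(A, rng.standard_normal(D), rng) for _ in range(n_rollouts)]

stat_marked, is_marked = verify(A, key, marked)
stat_unmarked, is_unmarked = verify(A, key, unmarked)
print(f"marked: stat={stat_marked:.2f}, decision={is_marked}")
print(f"unmarked: stat={stat_unmarked:.2f}, decision={is_unmarked}")
```

In this toy setting, only half of the latent directions are observable through `A`, which mirrors the partial view of the policy's output mentioned in the abstract. The Gaussian prior term in `map_recover` keeps the estimate well defined in the unobserved directions, and aggregating over rollouts compensates for the weak evidence from any single rollout.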