Global Sketch-Based Watermarking for Diffusion Language Models

2026-06-03 | Cryptography and Security

Cryptography and Security, Computation and Language, Machine Learning
AI summary

The authors study how to embed watermarks in text produced by diffusion-based language models, which generate many positions of a sequence at once rather than token by token. They propose a method that controls a global "sketch", a vector-valued summary of the whole text, to carry the watermark, instead of adjusting the choice of each word based on the words before it. As a result, detection does not depend on the local contexts seen during generation, and the detection statistic does not depend on token order. They also analyze how much the watermark distorts the text, how reliably it can be detected, and how robust it is to edits.

watermarking, language models, autoregressive models, diffusion models, masked language models, token distribution, global sketch, text generation, detection robustness, sequence sampling
Authors
Daniel Zhao
Abstract
Watermarking methods for language models have been studied extensively in the autoregressive setting, where tokens are generated sequentially. These works largely focus on local-context schemes that perturb the next token's distribution as a function of its preceding tokens. In diffusion language models, distributions over many unresolved positions are jointly sampled, which makes additive statistics of the entire sequence tractable during generation. We propose a watermark for masked diffusion language models that controls a global, vector-valued sketch representation of the text. Compared to context-dependent watermarking, the sketch formulation decouples detection from the local contexts seen during generation, resulting in an order-agnostic statistic and a watermarking rule that does not manifest as a simple token bias. We analyze the distortion, soundness, and robustness properties of the method.
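To make "additive" and "order-agnostic" concrete, here is a toy illustration in Python. It is not the paper's construction, which the abstract does not specify. The keyed ±1 token vectors, the dimension of 16, the `target_direction` helper, and the normalization are all assumptions chosen for illustration. The only point is that a sketch formed by summing per-token vectors is unchanged when the tokens are reordered.

```python
import hashlib
import math
import random


def _keyed_signs(key: bytes, label: bytes, dim: int) -> list[int]:
    """Derive a pseudorandom +/-1 vector from a key and a label (hypothetical helper)."""
    if not 1 <= dim <= 256:
        raise ValueError("dim must be between 1 and 256 (one SHA-256 digest)")
    bits = int.from_bytes(hashlib.sha256(key + label).digest(), "big")
    return [1 if (bits >> i) & 1 else -1 for i in range(dim)]


def token_vector(key: bytes, token_id: int, dim: int = 16) -> list[int]:
    """Keyed +/-1 vector assigned to a token id (assumed, not from the paper)."""
    return _keyed_signs(key, b"tok" + token_id.to_bytes(4, "big"), dim)


def target_direction(key: bytes, dim: int = 16) -> list[int]:
    """Secret +/-1 direction the sketch is compared against (assumed)."""
    return _keyed_signs(key, b"target", dim)


def sketch(key: bytes, tokens: list[int], dim: int = 16) -> list[int]:
    """Additive sketch: the coordinate-wise sum of the per-token vectors."""
    total = [0] * dim
    for tok in tokens:
        for i, v in enumerate(token_vector(key, tok, dim)):
            total[i] += v
    return total


def detection_score(key: bytes, tokens: list[int], dim: int = 16) -> float:
    """Projection of the sketch onto the target, with a heuristic normalization.

    Dividing by sqrt(len(tokens) * dim) is only a rough scale: it treats the
    token vectors as independent, which repeated tokens violate.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    s = sketch(key, tokens, dim)
    dot = sum(a * b for a, b in zip(s, target_direction(key, dim)))
    return dot / math.sqrt(len(tokens) * dim)


if __name__ == "__main__":
    key = b"demo-key"
    tokens = [5, 17, 42, 17, 99, 3, 250, 8]
    shuffled = tokens[:]
    random.Random(0).shuffle(shuffled)
    # The sum does not depend on order, so both checks print True.
    print(sketch(key, tokens) == sketch(key, shuffled))
    print(detection_score(key, tokens) == detection_score(key, shuffled))
```

Because the sketch is a plain sum, a detector of this kind needs only the multiset of tokens, not the contexts in which they were generated. How the sketch is steered during diffusion sampling is the substance of the paper and is not modeled here.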