FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

2026-06-02 • Machine Learning

Machine Learning • Artificial Intelligence • Cryptography and Security

AI summary

The authors explain that the behavior of a large language model (LLM) depends not just on its weights but also on how each deployed instance is configured. This means a model might be safe in one setup but harmful in another. Current methods to identify LLMs focus on what the model is, not how it is being used, which makes it hard to check whether a deployed model follows the rules. The authors propose instance-level fingerprinting, which tells apart different setups of the same LLM. Their method, FLIPS, remains accurate even when some setups are unknown, which makes regulatory compliance checks more feasible.

Large Language Model • instance-level parameters • instructional prompt • sampling configuration • quantization • fingerprinting • AI regulation • model identification • FLIPS method • behavioral compliance

Authors

Gurvan Richardeau, Gohar Dashyan, Erwan Le Merrer, Gilles Tredan

Abstract

The literature shows that a Large Language Model's (LLM) behavior is conditioned not only by its original weights but also by its instance-level parameters, such as instructional prompt, sampling configuration, or quantization. A model that generates safe outputs under one configuration may produce toxic content under another. However, current LLM identification techniques (such as fingerprinting) focus on intellectual property protection, and their design favors robustness to changes in these instance-level parameters. This poses a critical challenge for AI regulation, in which compliance assessments target actual deployed behaviors, not model provenance. In this paper, we introduce instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same LLM. Our method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set, where some targets are unknown) identification accuracy across 237 model instances, versus 35% for the adapted LLMmap baseline. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible. Code is available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.
