Suppressing Forgery-Specific Shortcuts for Generalizable Deepfake Detection

2026-06-01, Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary

The authors address the problem that deepfake detectors often fail to recognize new types of fake videos because they rely on specific quirks tied to known forgery methods. They propose a Shortcut Subspace Suppression (S³) approach that identifies these method-specific quirks by analyzing variations linked to each forgery technique. Their method reduces dependence on these quirks during training and also offers a way to adjust the model during testing for better generalization. Experiments show their approach improves detection on unseen forgery methods without hurting performance on familiar ones.

Deepfake detection, Generalization, Shortcut learning, Subspace modeling, Singular Value Decomposition (SVD), Linear probe, Feature suppression, Cross-method evaluation, Neural network interpretability
Authors
Yihui Wang, Yonghui Yang, Jilong Liu, Fengbin Zhu, Le Wu, Tat-Seng Chua
Abstract
Deepfake detection suffers from poor generalization across forgery methods, as existing models tend to rely on spurious method-specific shortcuts that fail to transfer to unseen manipulations. While recent approaches attempt to improve generalization, they lack an explicit mechanism to identify and suppress such shortcuts in learned representations. In this work, we propose the Shortcut Subspace Suppression (S³) framework, which explicitly characterizes and suppresses method-specific shortcuts via subspace modeling. Our key insight is that the variations distinguishing different forgery methods capture method-specific artifacts and thus serve as an effective proxy for method-specific shortcuts. To this end, we train a lightweight linear probe for forgery method classification and perform Singular Value Decomposition (SVD) to extract the dominant shortcut subspace. Building on this formulation, we develop two complementary strategies to reduce shortcut reliance. During training, we softly suppress the shortcut subspace in feature representations, encouraging the model to rely on more generalizable cues for real/fake discrimination. At inference time, we introduce a training-free counterpart that attenuates neurons aligned with the identified shortcut directions, enabling plug-and-play generalization enhancement with improved interpretability. Extensive experiments on multiple benchmarks demonstrate that our method significantly improves cross-method generalization while maintaining strong in-domain performance. The code will be released upon acceptance of the submission.
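
The abstract does not say what the SVD is applied to or how soft suppression is parameterized. As a minimal sketch, assuming the shortcut subspace is spanned by the top-k right singular vectors of the probe's weight matrix and that soft suppression scales the in-subspace component of each feature by (1 - alpha), the idea might look like the following in NumPy. The names shortcut_basis, soft_suppress, k, and alpha are hypothetical, and the random matrices stand in for a trained probe and real features; this is not the authors' implementation.

```python
import numpy as np


def shortcut_basis(probe_weights: np.ndarray, k: int) -> np.ndarray:
    """Return a (d, k) orthonormal basis for the dominant row space of the probe.

    probe_weights has shape (num_methods, d): one weight row per forgery method.
    Assumption: the top-k right singular vectors approximate the shortcut subspace.
    """
    _, _, vt = np.linalg.svd(probe_weights, full_matrices=False)
    return vt[:k].T


def soft_suppress(features: np.ndarray, basis: np.ndarray, alpha: float) -> np.ndarray:
    """Scale the component of each feature inside span(basis) by (1 - alpha).

    alpha = 0 leaves features unchanged; alpha = 1 projects the subspace out.
    The component orthogonal to the subspace is never modified.
    """
    coords = features @ basis                 # (n, k) coordinates in the subspace
    return features - alpha * (coords @ basis.T)


# Stand-in data (assumption): 4 forgery methods, 16-dimensional features.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))   # placeholder for trained linear-probe weights
X = rng.normal(size=(8, 16))   # placeholder for backbone features

V = shortcut_basis(W, k=2)
X_soft = soft_suppress(X, V, alpha=0.5)   # partial suppression
X_hard = soft_suppress(X, V, alpha=1.0)   # full removal of the subspace component
```

In this sketch, alpha controls how strongly the subspace is attenuated, so the same function covers both a mild penalty on method-specific directions and their complete removal.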