Off-the-shelf Vision Models Benefit Image Manipulation Localization

2026-04-10 · Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Multimedia
AI summary

The authors observe that image manipulation localization and general vision tasks such as segmentation are usually studied separately because they rely on different kinds of features. They argue that general semantic knowledge about images can nonetheless improve manipulation localization. To exploit this, they propose a small trainable adapter, ReVi, that attaches to existing off-the-shelf vision models whose parameters stay frozen, so only the adapter is fine-tuned. The adapter separates redundant semantic features from manipulation-specific cues and selectively enhances the latter, avoiding model redesign and full retraining.

Image Manipulation Localization, General Vision Tasks, Semantic Features, Trainable Adapter, ReVi, Robust Principal Component Analysis, Image Segmentation, Model Fine-tuning, Off-the-shelf Models
Authors
Zhengxuan Zhang, Keji Song, Junmin Hu, Ao Luo, Yuezun Li
Abstract
Image manipulation localization (IML) and general vision tasks are typically treated as two separate research directions due to the fundamental differences between manipulation-specific and semantic features. In this paper, however, we bridge this gap by introducing a fresh perspective: these two directions are intrinsically connected, and general semantic priors can benefit IML. Building on this insight, we propose a novel trainable adapter (named ReVi) that repurposes existing off-the-shelf general-purpose vision models (e.g., image generation and segmentation networks) for IML. Inspired by robust principal component analysis, the adapter disentangles semantic redundancy from manipulation-specific information embedded in these models and selectively enhances the latter. Unlike existing IML methods that require extensive model redesign and full retraining, our method relies on the off-the-shelf vision models with frozen parameters and only fine-tunes the proposed adapter. The experimental results demonstrate the superiority of our method, showing the potential for scalable IML frameworks.
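The abstract says the adapter is inspired by robust principal component analysis (RPCA) but does not spell out the decomposition. As background only, the sketch below shows classical RPCA (principal component pursuit solved with ADMM), which splits a matrix into a low-rank part plus a sparse part. It is not the ReVi adapter. Reading the low-rank part as "semantic redundancy" and the sparse part as "manipulation-specific information" is an assumed analogy, and all names (`robust_pca`, `background`, `traces`, `features`) and the synthetic data are hypothetical.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink the singular values: the proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def robust_pca(m, max_iter=2000, tol=1e-7):
    """Split a nonzero matrix m into low_rank + sparse (principal component pursuit)."""
    rows, cols = m.shape
    lam = 1.0 / np.sqrt(max(rows, cols))         # usual weight on the sparse term
    mu = rows * cols / (4.0 * np.abs(m).sum())   # common heuristic penalty parameter
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    for _ in range(max_iter):
        low_rank = singular_value_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low_rank + dual / mu, lam / mu)
        residual = m - low_rank - sparse
        dual = dual + mu * residual              # dual ascent on the constraint m = L + S
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low_rank, sparse


# Synthetic stand-in for a feature matrix: rank-2 "background" plus sparse "traces".
rng = np.random.default_rng(0)
background = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
mask = rng.random((60, 60)) < 0.03
traces = np.where(mask, rng.choice([-5.0, 5.0], size=(60, 60)), 0.0)
features = background + traces

low_rank, sparse = robust_pca(features)
```

In this toy setting `low_rank` should approximate `background` and `sparse` should approximate `traces`. How the paper adapts this idea into a trainable module on top of frozen vision models is not described in the abstract and is not reproduced here.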