Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models

2026-05-25
Computer Vision and Pattern Recognition

AI summary

The authors observe that Vision Language Models, which link images and text, are easily fooled by small adversarial perturbations that break the alignment between the two modalities. To address this, they propose Closed-Loop Bidirectional Prompting, a feedback loop between the image and text branches of a frozen model that recovers agreement between them. A stable 'Semantic Anchor' keeps the loop steady when an attack corrupts the features. Tests on 11 datasets show improved robustness and generalization to new classes, with a favorable trade-off between computational cost and accuracy.

Vision Language Models, Adversarial Perturbations, Cross-modal Semantic Alignment, Prompting, Semantic Anchor, Feature Corruption, Bootstrapping, Robustness, Frozen Encoders, Instance-adaptive Updating
Authors
Xiao Liu, Jiaxiang Liu, Boci Peng, Boren Hu, Yusong Wang, Xiwen Chen, Prayag Tiwari, Liming Zhang, Mingkun Xu
Abstract
Vision Language Models adapt well to downstream tasks but are highly vulnerable to adversarial perturbations that disrupt cross-modal semantic alignment. Existing defenses are largely unidirectional or structural, failing to exploit bidirectional cross-modal complementarity and instance-wise adaptive protection. To overcome these limitations, we propose Closed-Loop Bidirectional Prompting, which casts robust adaptation as cross-modal agreement recovery via a dynamic feedback loop on frozen encoders. A Semantic Anchor is introduced as a stable prior to constrain cyclic updates and mitigate perturbation-induced feature corruption. Through anchor-based bootstrapping, textual semantics denoise visual representations, while the refined visuals enable instance-adaptive prompt updating, yielding a rectified and robust consensus. Extensive evaluations across 11 datasets validate state-of-the-art robustness and strong base-to-new generalization, while maintaining a favorable trade-off between computational cost and accuracy.
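The abstract describes the loop only in words, so the following is a toy sketch of how such a closed loop could look, not the authors' implementation. Everything in it is an assumption made for illustration: the function name `closed_loop_refine`, the mixing weights `alpha`, `beta` and `gamma`, the softmax temperature, and in particular the choice to use the initial (frozen) text embeddings as the Semantic Anchor. Features are plain Python lists standing in for frozen-encoder outputs.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, temperature=0.1):
    """Temperature-scaled softmax; the temperature is a hypothetical choice."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def mix(a, b, w):
    """Convex combination (1 - w) * a + w * b, re-normalized to unit length."""
    return normalize([(1 - w) * x + w * y for x, y in zip(a, b)])

def closed_loop_refine(image_feat, text_feats, steps=3,
                       alpha=0.5, beta=0.2, gamma=0.5):
    """Toy closed loop between one image feature and per-class text features.

    Assumption: the anchor is a fixed copy of the initial text features.
    """
    v = normalize(image_feat)
    anchors = [normalize(t) for t in text_feats]  # fixed prior, never updated
    prompts = list(anchors)                       # instance-adaptive copies
    for _ in range(steps):
        # Text -> image: pull the visual feature toward the text consensus.
        probs = softmax([dot(v, p) for p in prompts])
        consensus = [sum(w * p[d] for w, p in zip(probs, prompts))
                     for d in range(len(v))]
        v = mix(v, normalize(consensus), alpha)
        # Image -> text: nudge each prompt toward the refined visual feature,
        # weighted by how strongly the image currently agrees with that class.
        prompts = [mix(p, v, beta * w) for p, w in zip(prompts, probs)]
        # Anchor constraint: pull the prompts back toward the fixed prior.
        prompts = [mix(p, a, gamma) for p, a in zip(prompts, anchors)]
    return v, prompts, softmax([dot(v, p) for p in prompts])

if __name__ == "__main__":
    # A feature that should match class 0 but has been pushed toward class 1.
    perturbed = [0.6, 0.5, 0.1]
    classes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    _, _, probs = closed_loop_refine(perturbed, classes)
    print(probs)
```

In this toy setting the loop only sharpens whatever class the perturbed feature still leans toward. It illustrates the data flow (text refines image, image refines prompts, anchor limits drift) rather than any robustness guarantee.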