T-VSS: Test-Time Visual Subspace Steering for Adversarial Robustness of Vision-Language Models
2026-06-22 • Computer Vision and Pattern Recognition
AI summary
The authors address how vision-language models (VLMs), which can classify images into categories they were never explicitly trained on, are easily fooled by small, deliberately crafted image changes known as adversarial perturbations. They point out that existing test-time defenses respond to an attacked image by adjusting text prompts or image pixels, which is an indirect and computationally expensive route. Their method, Test-time Visual Subspace Steering (T-VSS), instead corrects the image features directly, restricting the correction to a compact low-rank subspace built from multiple views of the attacked image. Experiments show that T-VSS makes models more robust to attacks while keeping accuracy on clean images competitive, and that it is more efficient than previous test-time approaches.
vision-language models, adversarial perturbations, test-time adaptation, feature space, entropy minimization, subspace, ImageNet, robustness, fine-grained recognition, zero-shot learning
Authors
Jaehyuk Jang, Minseok Seo, Seungju Cho, Kangwook Ko, Changick Kim
Abstract
Vision-language models (VLMs) achieve strong zero-shot recognition, but they remain highly vulnerable to adversarial perturbations. Recent test-time adaptation methods improve robustness without retraining, but they do not directly adapt the corrupted visual representation itself. Prompt-based methods adapt learnable text prompts, while input-space methods optimize pixels or padding at test time. These approaches can improve predictions, but they do so through an indirect and expensive optimization path. We propose Test-time Visual Subspace Steering (T-VSS), a lightweight defense that performs test-time adaptation directly in the visual feature space. T-VSS first builds a sample-specific low-rank subspace from multi-view feature residuals anchored at the attacked image. It then learns a shared feature correction within this subspace using reliability-weighted entropy minimization. By constraining adaptation to a compact visual geometry, T-VSS steers attacked features toward more stable and discriminative predictions while avoiding noisy full-space updates. Experiments on fine-grained, ImageNet, and ImageNet-OOD benchmarks show that T-VSS improves adversarial robustness while maintaining competitive clean accuracy and better efficiency than prior test-time adaptation methods.
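The two steps described in the abstract (a low-rank subspace from multi-view residuals, then a shared correction learned by weighted entropy minimization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the abstract does not give the formulas, so several pieces are assumptions made for illustration: the function names (`build_subspace`, `steer`), the use of an SVD of the residuals to obtain the basis, reliability weights defined as a softmax over negative per-view entropy, plain gradient descent with step halving, the temperature value, and the omission of feature renormalization after the correction. The feature arrays stand in for the outputs of a VLM's image and text encoders.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def build_subspace(anchor, view_feats, rank):
    """Orthonormal basis (rank, D) for the dominant multi-view residual directions."""
    residuals = view_feats - anchor              # (V, D), anchored at the attacked image
    _, _, vt = np.linalg.svd(residuals, full_matrices=False)
    return vt[:rank]                             # rows are orthonormal

def weighted_entropy(coeff, basis, feats, text_feats, weights, temp):
    """Weighted mean prediction entropy after adding the shared correction."""
    logits = (feats + coeff @ basis) @ text_feats.T / temp
    probs = softmax(logits)
    return float((weights * entropy(probs)).sum()), probs

def steer(anchor, view_feats, text_feats, rank=4, steps=50, lr=0.05, temp=0.1):
    """Learn one correction, shared by all views, restricted to the subspace."""
    basis = build_subspace(anchor, view_feats, rank)
    feats = np.vstack([anchor, view_feats])      # (V + 1, D)
    coeff = np.zeros(basis.shape[0])             # correction coordinates in the subspace

    # Assumed reliability weights: confident (low-entropy) views count more.
    # They are computed once and treated as constants during optimization.
    uniform = np.ones(len(feats)) / len(feats)
    _, probs0 = weighted_entropy(coeff, basis, feats, text_feats, uniform, temp)
    weights = softmax(-entropy(probs0))

    loss, probs = weighted_entropy(coeff, basis, feats, text_feats, weights, temp)
    loss_before = loss
    proj = text_feats @ basis.T / temp           # (C, rank): d logits / d coeff

    for _ in range(steps):
        h = entropy(probs)
        # Gradient of entropy w.r.t. logits: -p_k * (log p_k + H)
        grad_logits = -probs * (np.log(probs + 1e-12) + h[:, None])
        grad = (weights[:, None] * grad_logits).sum(axis=0) @ proj
        trial = coeff - lr * grad
        trial_loss, trial_probs = weighted_entropy(
            trial, basis, feats, text_feats, weights, temp)
        if trial_loss < loss:                    # accept only improving steps
            coeff, loss, probs = trial, trial_loss, trial_probs
        else:
            lr *= 0.5                            # otherwise shrink the step size
    return anchor + coeff @ basis, basis, loss_before, loss
```

Calling `steer(anchor, view_feats, text_feats)` with an attacked image feature of shape `(D,)`, augmented-view features of shape `(V, D)`, and class text features of shape `(C, D)` returns the corrected feature, the basis, and the weighted entropy before and after. Because only `rank` coefficients are optimized, the correction cannot leave the span of the residual directions, which is the sense in which the update avoids a full-space search.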