RIRF: Reasoning Image Restoration Framework

2026-04-10 · Computer Vision and Pattern Recognition

AI summary

The authors present a new method called Reason and Restore (R&R) for fixing images damaged in many unknown ways using one model. Their method first 'thinks through' what kind of damage is present, how severe it is, and what is in the scene before trying to fix the picture. This reasoning helps the fixing part work better and also makes it easier to understand what the model is doing. They show that R&R improves image restoration performance and explains its process better than previous methods.

Universal Image Restoration, Degradation Diagnosis, Chain-of-Thought Reasoning, Qwen3-VL, Reinforcement Learning, Image Restoration, Multimodal Models, Semantic Understanding, Pixel Reconstruction
Authors
Wending Yan, Rongkai Zhang, Kaihua Tang, Yu Cheng, Qiankun Liu
Abstract
Universal image restoration (UIR) aims to recover clean images from diverse and unknown degradations using a unified model. Existing UIR methods primarily focus on pixel reconstruction and often lack explicit diagnostic reasoning over degradation composition, severity, and scene semantics prior to restoration. We propose Reason and Restore (R&R), a novel framework that integrates structured Chain-of-Thought (CoT) reasoning into the image restoration pipeline. R&R introduces an explicit reasoner, implemented by fine-tuning Qwen3-VL, to diagnose degradation types, quantify degradation severity, infer key degradation-related factors, and describe relevant scene and object semantics. The resulting structured reasoning provides interpretable and fine-grained diagnostic priors for the restorer. To further improve restoration quality, the degradation severity quantified by the reasoner is used as a reinforcement learning (RL) signal to guide and strengthen the restorer. Unlike existing multimodal LLM-based agentic systems that decouple reasoning from low-level vision tasks, R&R tightly couples semantic diagnostic reasoning with pixel-level restoration in a unified framework. Extensive experiments across diverse UIR benchmarks demonstrate that R&R achieves state-of-the-art performance while also making the restoration process interpretable.
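
The abstract does not give the reasoner's output schema or the exact form of the RL signal, so the following is only a minimal sketch under stated assumptions. It supposes the structured diagnosis can be held in a small record (the hypothetical `Diagnosis` class) with a per-degradation severity in [0, 1], and that one plausible reward is the fraction of total severity removed between the degraded input and the restored output (the hypothetical `severity_reward` function). None of these names or formulas come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Diagnosis:
    """Hypothetical structured reasoner output; the schema is an assumption."""
    severities: Dict[str, float]                      # degradation type -> severity in [0, 1]
    factors: List[str] = field(default_factory=list)  # degradation-related factors
    scene: str = ""                                   # scene and object description

    def total_severity(self) -> float:
        # Sum of per-degradation severities (assumed aggregation rule).
        return sum(self.severities.values())


def severity_reward(before: Diagnosis, after: Diagnosis) -> float:
    """Assumed reward: fraction of total severity removed, clamped to [-1, 1].

    `before` is the diagnosis of the degraded input and `after` is the
    diagnosis of the restorer's output. This is an illustrative choice,
    not the reward defined in the paper.
    """
    start = before.total_severity()
    end = after.total_severity()
    raw = (start - end) / max(start, 1e-8)  # guard against a clean input
    return max(-1.0, min(1.0, raw))


# Illustrative values only (not results from the paper).
degraded = Diagnosis(
    severities={"haze": 0.75, "noise": 0.25},
    factors=["dense fog", "low light"],
    scene="street with parked cars",
)
restored = Diagnosis(severities={"haze": 0.25, "noise": 0.0})

reward = severity_reward(degraded, restored)
print(reward)  # 0.75
```

Normalizing by the initial severity keeps the reward comparable across lightly and heavily degraded inputs; an absolute difference would be an equally reasonable assumption.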