GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis
2026-04-09 • Computer Vision and Pattern Recognition
AI summary
The authors aim to improve automated detection of defects in manufactured products, a task made difficult by the scarcity of real defect examples to learn from. They propose GroundingAnomaly, a few-shot method for generating synthetic defect images that look realistic and give precise control over where the defects appear. A Spatial Conditioning Module places defects using per-pixel semantic maps, and a Gated Self-Attention Module adds this conditioning to a frozen U-Net without overwriting what the pretrained model already knows. On the MVTec AD and VisA datasets, the authors report state-of-the-art results in detecting, segmenting, and locating individual defect instances.
anomaly detection, few-shot learning, image synthesis, semantic maps, U-Net, gated attention, anomaly segmentation, industrial quality control, image inpainting, MVTec AD dataset
Authors
Yishen Liu, Hongcang Chen, Pengcheng Zhao, Yunfan Bao, Yuxi Tian, Jieming Zhang, Hao Chen, Zheng Zhi, Yongchun Liu, Ying Li, Dongpu Cao
Abstract
The performance of visual anomaly inspection in industrial quality control is often constrained by the scarcity of real anomalous samples. Consequently, anomaly synthesis techniques have been developed to enlarge training sets and enhance downstream inspection. However, existing methods either suffer from poor integration caused by inpainting or fail to provide accurate masks. To address these limitations, we propose GroundingAnomaly, a novel few-shot anomaly image generation framework. Our framework introduces a Spatial Conditioning Module that leverages per-pixel semantic maps to enable precise spatial control over the synthesized anomalies. Furthermore, a Gated Self-Attention Module is designed to inject conditioning tokens into a frozen U-Net via gated attention layers. This carefully preserves pretrained priors while ensuring stable few-shot adaptation. Extensive evaluations on the MVTec AD and VisA datasets demonstrate that GroundingAnomaly generates high-quality anomalies and achieves state-of-the-art performance across multiple downstream tasks, including anomaly detection, segmentation, and instance-level detection.
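The abstract does not give equations for the Gated Self-Attention Module, so the following is only a minimal sketch of how a gated injection of conditioning tokens might look. It assumes a tanh gate on a learnable scalar `gamma` initialized to zero, a formulation used in earlier grounded-generation work and not confirmed as this paper's design. The function names, the identity query/key/value projections, and the single attention head are all simplifications chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (a simplification)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted average of the value vectors (here, the tokens themselves).
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(dim)])
    return out


def gated_self_attention(visual, cond, gamma):
    """Hypothetical gated injection: visual + tanh(gamma) * Attn([visual; cond]).

    Attention runs over the concatenation of visual and conditioning tokens,
    but only the outputs at the visual positions are kept and added back.
    """
    attended = self_attention(visual + cond)[:len(visual)]
    gate = math.tanh(gamma)
    return [[x + gate * a for x, a in zip(x_row, a_row)]
            for x_row, a_row in zip(visual, attended)]


# Two visual tokens from the frozen backbone, one conditioning token.
visual = [[1.0, 0.0], [0.0, 1.0]]
cond = [[0.5, 0.5]]

# With gamma = 0 the gate is closed and the visual tokens pass through unchanged.
unchanged = gated_self_attention(visual, cond, 0.0)

# A nonzero gamma lets the conditioning tokens influence the visual tokens.
adapted = gated_self_attention(visual, cond, 1.0)
```

Under this assumption, starting from `gamma = 0` means the frozen U-Net initially behaves exactly as it did before adaptation, and the conditioning signal is blended in only as the gate is learned. That is one plausible reading of how pretrained priors could be preserved during few-shot training.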