Revisiting Vehicle Color Recognition in Long-Tailed Surveillance Scenarios

2026-06-11 • Computer Vision and Pattern Recognition

AI summary

The authors studied how to improve vehicle color recognition in surveillance videos, especially when some colors are rare and hard to identify. They used new computer-generated images to help teach their system about these uncommon colors. By combining these images with advanced training techniques, their method got better at recognizing all colors, including rare ones. They also found that some mistakes are very hard to avoid because even people sometimes can't tell the colors clearly. The authors shared their code and generated images for others to use.

vehicle color recognition, class imbalance, synthetic data augmentation, text-conditioned image generation, image-conditioned color editing, loss reweighting, learning-rate scheduling, foreground-aware preprocessing, ensemble fusion

Authors

Vinícius Orrú, Bruno H. Foggiatto, Gabriel E. Lima, David Menotti, Rayson Laroca

Abstract

Vehicle color recognition is an important cue for vehicle identification in surveillance systems, especially when license plates are illegible due to low resolution, occlusion, motion blur, or poor illumination. However, real-world vehicle color distributions are highly imbalanced, making overall accuracy insufficient to assess performance on rare but operationally relevant colors. This paper presents a comprehensive study of vehicle color recognition under severe class imbalance using UFPR-VeSV, a challenging real-world surveillance dataset. We investigate synthetic minority-class augmentation through two off-the-shelf generative strategies: text-conditioned image generation with RunDiffusion/JuggernautXL and image-conditioned color editing with Gemini 2.0 Flash. The curated synthetic data are combined with modern visual representations, loss reweighting, learning-rate scheduling, color-safe augmentation, foreground-aware preprocessing, and ensemble fusion. The best-performing approach achieves 94.6% micro accuracy and 79.7% macro accuracy, improving macro accuracy by 8.2 percentage points over recent literature. A manual error analysis further shows that many remaining failures are visually ambiguous even for human annotators, highlighting the practical limits of color-based vehicle identification in unconstrained surveillance imagery. The generated images and source code are publicly available at https://github.com/viniciusorru/vcr-synthetic
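The gap between micro and macro accuracy reported in the abstract can be illustrated with a small sketch. It assumes the common definitions, which the abstract does not spell out: micro accuracy as the fraction of all samples classified correctly, and macro accuracy as the unweighted mean of per-class accuracies. The function names and the toy labels below are hypothetical and are not taken from the authors' code or from UFPR-VeSV.

```python
from collections import Counter


def micro_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly (dominated by frequent classes)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (every class counts equally)."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy data: 95 common-color samples, 5 rare-color samples,
# and a degenerate classifier that always predicts the common color.
y_true = ["white"] * 95 + ["purple"] * 5
y_pred = ["white"] * 100

micro = micro_accuracy(y_true, y_pred)
macro = macro_accuracy(y_true, y_pred)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

In this toy case the classifier never recognizes the rare color, yet its micro accuracy stays high while its macro accuracy drops sharply, which is why overall accuracy alone can hide poor performance on rare classes.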
