Unsupervised Collaborative Domain Adaptation for Driving Scene Parsing

2026-06-01 • Computer Vision and Pattern Recognition

AI summary

The authors address the challenge of adapting self-driving car vision systems to new environments without needing original training data or expensive new labels. They propose a method called UCDA that combines knowledge from multiple pre-trained models to improve scene understanding in new driving conditions. By comparing predictions and refining models together on unlabeled target data, the method creates a reliable final model for deployment. Tests show this approach helps autonomous vehicles better recognize their surroundings under various conditions.

domain adaptation, unsupervised learning, driving scene parsing, source-free adaptation, prototype memory bank, model distillation, cross-model consistency, autonomous vehicles, semantic segmentation, multi-source learning

Authors

Jiahe Fan, Shaolong Shu, Mingjian Sun, Tiehua Zhang, Bohong Xiao, Hanli Wang, Rui Fan

Abstract

Reliable driving scene parsing is a fundamental capability for autonomous vehicles operating in open and dynamic driving environments. However, adapting perception models to new deployment domains remains challenging because pixel-level annotations are expensive to obtain, while source-domain data are often inaccessible due to privacy, security, or ownership constraints. Existing source-free unsupervised domain adaptation methods typically rely on a single pre-trained source model, which makes the adapted perception system vulnerable to source-specific biases and limits its robustness under diverse road layouts, illumination conditions, weather patterns, and traffic conditions. This article presents an unsupervised collaborative domain adaptation (UCDA) framework for driving scene parsing in a source-free setting, which transfers complementary knowledge from multiple pre-trained source models to a unified target model without accessing any original source samples. To compare predictions from independently trained models, UCDA constructs a class-level prototype memory bank and estimates cross-model prediction reliability through prototype similarity, reducing the effect of inconsistent confidence scales across source models. Based on the resulting complementary supervision, UCDA adopts a two-stage transfer strategy: multiple source models are first refined on unlabeled target-domain driving data through collaborative optimization with positive and negative consistency constraints, and their validated expertise is then distilled into a single deployable target model. Comprehensive evaluations on public driving-scene datasets and real-world data collected from an autonomous vehicle platform demonstrate that UCDA effectively consolidates complementary multi-source knowledge, improving target-domain scene parsing reliability and generalization across diverse driving environments.
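
The prototype-similarity idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the per-model bank, the exponential-moving-average update, the cosine score, and every name below (`PrototypeBank`, `reliability`, `select_label`) are assumptions made for illustration. The point it shows is that a similarity to a class prototype is bounded in [-1, 1] for every model, so scores from independently trained models can be compared even when their softmax confidences are scaled differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class PrototypeBank:
    """Hypothetical class-level prototype memory for one source model.

    Assumption: each model keeps one running-mean feature vector per class.
    """

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.prototypes = {}  # class id -> prototype vector

    def update(self, class_id, feature):
        """Blend a new feature into the stored prototype (EMA update, assumed)."""
        old = self.prototypes.get(class_id)
        if old is None:
            self.prototypes[class_id] = list(feature)
            return
        m = self.momentum
        self.prototypes[class_id] = [
            m * o + (1.0 - m) * f for o, f in zip(old, feature)
        ]

    def reliability(self, class_id, feature):
        """Score a prediction by its similarity to the predicted class prototype."""
        proto = self.prototypes.get(class_id)
        if proto is None:
            return 0.0  # no prototype yet, so no evidence for this class
        return cosine(feature, proto)


def select_label(candidates, banks):
    """Pick the prediction whose feature best matches its own class prototype.

    candidates: list of (model_index, predicted_class, feature) for one pixel.
    banks: list of PrototypeBank, one per source model.
    Returns (reliability_score, predicted_class) of the best candidate.
    """
    scored = [
        (banks[model].reliability(cls, feat), cls)
        for model, cls, feat in candidates
    ]
    return max(scored)
```

In this sketch, each source model would call `update` with features of pixels it labels on target data, and `select_label` would then arbitrate between disagreeing models for a pixel. How UCDA actually builds the bank, thresholds the scores, or applies its positive and negative consistency constraints is not specified in the abstract.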