Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation

2026-06-05 · Machine Learning

AI summary

The authors address the problem of teaching neural networks to group data into clusters without labels and without forgetting old groups as new data arrives. They propose a method called FBCC, which uses a teacher-student setup to help the model learn new clusters while remembering old ones without storing past data. Their approach is designed specifically for clustering tasks and avoids memory and privacy issues seen in other methods. Tests on several datasets show that their method reduces forgetting and improves clustering accuracy compared to previous methods.

Unsupervised Continual Learning, Catastrophic Forgetting, Knowledge Distillation, Clustering, Teacher-Student Network, Replay Buffers, Continual Learning, Clustering Projector, Unsupervised Learning, Sequential Tasks
Authors
Mohammadreza Sadeghi, Sareh Soleimani, Zihan Wang, Narges Armanfard
Abstract
Unsupervised Continual Learning (UCL) aims to enable neural networks to learn sequential tasks without labels or access to past data. A major challenge in this setting is Catastrophic Forgetting, where models forget previously learned tasks upon learning new ones. This challenge is amplified in UCL due to the absence of labels to guide learning and memory retention. Existing mitigation strategies, such as knowledge distillation and replay buffers, often raise memory and privacy concerns. Moreover, current UCL methods largely overlook clustering-specific objectives. To fill this gap, we introduce Unsupervised Continual Clustering (UCC) and propose Forward-Backward Knowledge Distillation for Continual Clustering (FBCC). FBCC employs a continual teacher network with a clustering projector and lightweight task-specific students. Through a dual-phase forward-backward distillation process, the teacher learns new clusters while preserving previously discovered cluster structure without storing past data. FBCC is a pioneering approach to UCC. Experiments on four benchmark datasets show that FBCC consistently outperforms existing continual learning baselines in clustering accuracy while significantly reducing catastrophic forgetting.
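
The abstract does not give the FBCC loss functions, so the following is only a generic illustration of knowledge distillation over soft cluster assignments, not the authors' method. It assumes each network outputs one logit per cluster and that one network is trained to match the other's softened assignment through a KL divergence. The names `softmax`, `distillation_loss`, `reference_logits`, `learner_logits`, and `temperature` are hypothetical, and which network plays which role in the forward and backward phases is not specified here.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert cluster logits into a soft cluster assignment (sums to 1)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(reference_logits, learner_logits, temperature=1.0, eps=1e-12):
    """KL(reference || learner) between softened cluster assignments.

    Assumption: a plain KL term; the paper's actual objectives may differ.
    """
    p = softmax(reference_logits, temperature)
    q = softmax(learner_logits, temperature)
    return sum(
        pi * (math.log(pi) - math.log(max(qi, eps)))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


if __name__ == "__main__":
    reference = [2.0, 0.5, -1.0]  # hypothetical logits for three clusters
    learner = [1.0, 1.0, 0.0]
    print(distillation_loss(reference, learner, temperature=2.0))
```

Under these assumptions, the loss is zero when the two networks assign a sample to clusters identically and grows as their assignments diverge. This is the sense in which a distillation term can discourage a network from drifting away from cluster structure it has already learned.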