PS-MOT: Cultivating Instance Awareness from Point Seeds for Multi-Object Tracking

2026-06-29 | Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Robotics
AI summary

The authors propose a way to track multiple objects in videos using only a single point per object instead of a full bounding box, which is cheaper to label. Their method, called PS-Track, refines these points into temporally consistent pseudo-labels, recovers object outlines with a wavelet-based attention module, and accounts for uncertainty in the pseudo-labels during training. They tested the method on four datasets (DanceTrack, EmboTrack, SportsMOT, and JRDB) and report that it works well across different tracking scenarios, making point-based tracking more practical.

Multi-Object Tracking, Point Supervision, Temporal-Feedback Prompting, Wavelet Attention, Pseudo-labels, Uncertainty-Guided Learning, Identity Drift, Topological Representation
Authors
Kai Luo, Fei Teng, Mengfei Duan, Wanjun Jia, Xu Wang, Hao Shi, Kunyu Peng, Zhiyong Li, Kailun Yang
Abstract
We introduce Point-supervised Multi-Object Tracking (PS-MOT) as a cost-effective alternative to traditional bounding box supervision, shifting the focus from spatial fitting to topological center-driven representation. However, PS-MOT faces challenges such as spatial ambiguity and identity drift, which stem from the lack of explicit geometric structure and scale constraints. To address these, we propose PS-Track, a hierarchical pipeline that transitions from points to instances across the data, model, and loss levels. At the data level, we introduce Temporal-Feedback Prompting (TFP) to evolve points into temporally consistent pseudo-labels using negative spatial cues and motion priors. At the model level, we design the Point-Excited Wavelet Attention (PEWA) module, which leverages semantic correlations to activate high-frequency components, "hallucinating" object boundaries. At the loss level, Uncertainty-Guided Gaussian Learning (UGL) models pseudo-labels as probabilistic distributions, dynamically calibrating supervision intensity. Experiments on DanceTrack, EmboTrack, SportsMOT, and JRDB demonstrate that PS-Track provides a feasible and effective point-supervised alternative across diverse tracking scenarios, establishing a new state of the art for point-supervised tracking. The source code is available at https://github.com/xifen523/PS-MOT.
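The abstract does not give the UGL formulation, so the following is only a generic illustration of what it can mean to treat a pseudo-label as a distribution and let its uncertainty calibrate supervision strength. It is a minimal sketch of a standard heteroscedastic Gaussian negative log-likelihood, not the authors' loss; the function name `gaussian_nll`, the `(cx, cy, w, h)` box layout, and all numeric values are assumptions made for illustration.

```python
import math


def gaussian_nll(pred, target, log_var):
    """Mean per-coordinate Gaussian negative log-likelihood (constant term dropped).

    Each target coordinate is treated as the mean of a Gaussian whose
    log-variance is given by log_var. A larger variance shrinks the squared
    residual term, so unreliable pseudo-labels pull on the prediction less,
    while the log-variance term penalizes claiming high uncertainty everywhere.
    """
    total = 0.0
    for p, t, s in zip(pred, target, log_var):
        total += 0.5 * (s + (p - t) ** 2 / math.exp(s))
    return total / len(pred)


# Hypothetical predicted box (cx, cy, w, h) and a noisy pseudo-label box.
pred = [50.0, 40.0, 20.0, 60.0]
pseudo = [52.0, 41.0, 26.0, 50.0]

# Same residuals, two different uncertainty levels (illustrative values).
confident = gaussian_nll(pred, pseudo, [0.0] * 4)  # variance 1
uncertain = gaussian_nll(pred, pseudo, [4.0] * 4)  # variance e^4

print(confident, uncertain)
```

With identical residuals, the high-variance case yields the smaller loss, which is the sense in which uncertainty scales down the supervision signal for unreliable pseudo-labels. In a trained model the log-variance would typically be predicted or estimated per label rather than fixed by hand.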