Privacy-Preserving Person Re-Identification from Temporal Sequences with Transformer and Hungarian Optimization
2026-06-22 • Computer Vision and Pattern Recognition
AI summary
The authors developed a new way to recognize people by using depth images, which show shapes and movement but hide faces to protect privacy. They use a special matching method called the Hungarian algorithm to connect different views of the same person. Their model also looks at sequences of frames with a Transformer to learn how people move, combining color and depth data. Testing on various datasets showed that their depth-only method works well compared to other approaches, while being better for privacy.
Person Re-Identification, Depth Images, Hungarian Algorithm, Transformer Encoder, Batch Hard Triplet Loss, RGB-D Data, Top-View Datasets, Cumulative Matching Characteristics (CMC), Mean Average Precision (mAP)
Authors
Raphaël Delécluse, Hazem Wannous, Laurent Guimas
Abstract
Person re-identification (Re-ID) is a crucial task in surveillance and human behavior analysis, often deployed in public spaces such as transport hubs. Traditional RGB-based Re-ID methods raise privacy concerns and are highly sensitive to lighting variations and occlusion. In this paper, we propose a novel Re-ID approach that leverages depth images, which inherently obscure facial and other identifiable features and therefore offer a privacy-preserving solution. Our method addresses the association problem between multiple views of individuals by applying the Hungarian algorithm, which selects the assignment that minimizes the global cost over the distance matrix. We further enhance the approach by feeding temporal sequences of frames to a Transformer encoder architecture that exploits both RGB and depth modalities. This architecture captures dynamic movement patterns, improving feature extraction and re-identification accuracy. Additionally, we employ batch hard triplet loss to enhance discriminative feature learning by focusing on the hardest samples. We evaluate both depth-only and RGB-D models on several top-view datasets, including TVPR2, GODPR, and BIWI RGBD-ID. Our results demonstrate that depth-only re-identification can achieve performance competitive with state-of-the-art methods, as measured by standard metrics such as Cumulative Matching Characteristics (CMC) and Mean Average Precision (mAP), while prioritizing privacy preservation.
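To make two of the ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a square query-by-gallery distance matrix, Euclidean distances between embeddings, and an illustrative margin of 0.5. The function names `match_identities` and `batch_hard_triplet_loss`, and all numeric values, are hypothetical. The assignment step relies on SciPy's `linear_sum_assignment`, which solves the same linear assignment problem as the Hungarian algorithm. The loss is written in NumPy for readability, whereas a trainable version would normally live in a deep learning framework.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_identities(dist):
    """Return the one-to-one query->gallery matching with minimal total cost.

    dist[i, j] is an (assumed) distance between query i and gallery j.
    """
    rows, cols = linear_sum_assignment(dist)  # solves the assignment problem
    matching = dict(zip(rows.tolist(), cols.tolist()))
    return matching, float(dist[rows, cols].sum())


def batch_hard_triplet_loss(emb, labels, margin=0.5):
    """Batch-hard triplet loss: for each anchor, use its farthest positive
    and its closest negative within the batch (margin is an assumed value)."""
    diff = emb[:, None, :] - emb[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))           # pairwise Euclidean distances
    same = labels[:, None] == labels[None, :]       # True where identities match
    hardest_pos = np.where(same, d, -np.inf).max(axis=1)
    hardest_neg = np.where(~same, d, np.inf).min(axis=1)
    return float(np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean())


# Toy distance matrix (made-up values): 3 queries x 3 gallery identities.
dist = np.array([
    [0.2, 0.3, 0.9],
    [0.1, 0.8, 0.7],
    [0.6, 0.5, 0.4],
])

# Independent nearest-neighbour lookup sends queries 0 and 1 to the same
# gallery identity, which is the conflict a global assignment avoids.
greedy = dist.argmin(axis=1)

matching, total_cost = match_identities(dist)

# Toy 1-D-style embeddings (made-up values) with two identities interleaved.
emb = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
labels = np.array([0, 0, 1, 1])
loss = batch_hard_triplet_loss(emb, labels, margin=0.5)

print("greedy:", greedy.tolist())
print("matching:", matching, "total cost:", round(total_cost, 3))
print("batch-hard loss:", loss)
```

In this toy case the per-query nearest neighbour assigns two queries to gallery identity 0, while the global assignment accepts a slightly worse match for query 0 so that every query receives a distinct identity at the lowest total cost. Whether the paper's pipeline uses square matrices, Euclidean distance, or this margin is not stated in the abstract, so those choices should be read as placeholders.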