Understanding Identity Continuity in Thermal Video through Scene-Level Consistency

2026-06-01 | Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Multimedia
AI summary

The authors studied how to track people more reliably in thermal video, where weak visual detail makes identities hard to tell apart. Rather than adopting a heavier tracker or a re-identification model, they added a lightweight post-processing step that repairs fragmented tracks by bridging short gaps and relinking track segments using timing, position, motion, and border cues. Their tests showed that this improved identity accuracy (IDF1 from 82.25 to 84.93) while preserving MOTA. They conclude that scene-level consistency does more to maintain identities than frame-by-frame matching alone.

Thermal pedestrian tracking, Multiple object tracking (MOT), YOLOv8, SORT algorithm, Identity continuity, Tracklet relinking, IDF1 score, MOTA, Spatial-temporal consistency, Re-identification
Authors
Wei-Chieh Sun, Gyungmin Ko, Heejae Kwon, Hsiang-Wei Huang, Jenq-Neng Hwang
Abstract
Thermal pedestrian multiple object tracking (MOT) remains challenging because weak appearance cues and frequent detection interruptions cause severe trajectory fragmentation. We study whether lightweight post-processing can recover identity continuity without relying on heavy re-identification models or complex online association. Starting from a YOLOv8 and SORT baseline, we add a modular identity-repair backend consisting of online short-gap remapping and offline tracklet relinking based on temporal, spatial, motion, and border cues. Controlled ablations on a fixed validation split and evaluation on the official PBVS Thermal Pedestrian MOT benchmark show that the main identity gains arise from conservative relinking, which improves IDF1 from 82.25 to 84.93 while preserving MOTA, and that many heuristic thresholds remain stable across broad operating ranges. These results suggest that, in low-information thermal imagery, robust identity recovery is achieved more effectively through high-precision trajectory relinking than through increased tracker complexity. Taken together, they provide a controlled analysis of identity recovery in thermal video, showing that scene-level spatial-temporal consistency contributes more to identity continuity than local frame-to-frame association.
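
To make the idea of conservative offline relinking concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each tracklet is summarized by its frame span, first and last box centers, and an end velocity. The `Tracklet` class, the `relink` function, the greedy one-to-one matching, and the thresholds `max_gap`, `max_dist`, and `border_margin` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class Tracklet:
    """Hypothetical summary of one tracker output fragment."""
    track_id: int
    start_frame: int
    end_frame: int
    first_center: Point   # box center at start_frame
    last_center: Point    # box center at end_frame
    velocity: Point       # pixels per frame at the end of the tracklet


def near_border(pt: Point, width: float, height: float, margin: float) -> bool:
    """Border cue: True if the point lies within `margin` pixels of the image edge."""
    x, y = pt
    return x < margin or y < margin or x > width - margin or y > height - margin


def relink(tracklets: List[Tracklet], width: float, height: float,
           max_gap: int = 30, max_dist: float = 40.0,
           border_margin: float = 20.0) -> Dict[int, int]:
    """Return a map from each original track id to its merged id.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    candidates = []
    for a in tracklets:
        # A tracklet ending near the border probably left the scene.
        if near_border(a.last_center, width, height, border_margin):
            continue
        for b in tracklets:
            gap = b.start_frame - a.end_frame
            # Temporal cue: b must start after a ends, within max_gap frames.
            if a.track_id == b.track_id or gap <= 0 or gap > max_gap:
                continue
            # A tracklet starting near the border is probably a new entrant.
            if near_border(b.first_center, width, height, border_margin):
                continue
            # Motion cue: extrapolate a's last position across the gap.
            pred_x = a.last_center[0] + a.velocity[0] * gap
            pred_y = a.last_center[1] + a.velocity[1] * gap
            # Spatial cue: predicted position must be close to b's first position.
            dist = hypot(pred_x - b.first_center[0], pred_y - b.first_center[1])
            if dist <= max_dist:
                candidates.append((dist, a.track_id, b.track_id))

    # Conservative greedy matching: closest pairs first, and each tracklet
    # gets at most one successor and at most one predecessor.
    candidates.sort()
    successor: Dict[int, int] = {}
    predecessor: Dict[int, int] = {}
    for _, a_id, b_id in candidates:
        if a_id in successor or b_id in predecessor:
            continue
        successor[a_id] = b_id
        predecessor[b_id] = a_id

    # Follow predecessor links back to the earliest tracklet in each chain.
    id_map: Dict[int, int] = {}
    for t in tracklets:
        root = t.track_id
        while root in predecessor:
            root = predecessor[root]
        id_map[t.track_id] = root
    return id_map
```

Applying the returned map to every detection's track id merges each linked chain under its earliest id. Rejecting candidates that end or begin near the image border is one way to keep the relinking high-precision, since those fragments more plausibly correspond to people leaving or entering the scene than to a single interrupted trajectory.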