Frequency-Enhanced Diffusion Models: Curriculum-Guided Semantic Alignment for Zero-Shot Skeleton Action Recognition

2026-04-10 · Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary

The authors address zero-shot skeleton action recognition: recognizing human actions from skeleton data without labeled examples of every action class. They observe that diffusion-based models tend to smooth out fast, fine-grained movements, a consequence of spectral bias. To counter this, they propose FDSM, a method that helps the model preserve high-frequency motion detail and align skeleton sequences more closely with text descriptions. They report better recognition of unseen actions and state-of-the-art results on the NTU RGB+D, PKU-MMD, and Kinetics-skeleton datasets.

Human Action Recognition, Zero-Shot Learning, Skeleton-Based Methods, Diffusion Models, Spectral Bias, Frequency Analysis, Semantic Embedding, NTU RGB+D Dataset, PKU-MMD Dataset, Kinetics Skeleton Dataset
Authors
Yuxi Zhou, Zhengbo Zhang, Jingyu Pan, Zhiyu Lin, Zhigang Tu
Abstract
Human action recognition is pivotal in computer vision, with applications ranging from surveillance to human-robot interaction. Despite the effectiveness of supervised skeleton-based methods, their reliance on exhaustive annotation limits generalization to novel actions. Zero-Shot Skeleton Action Recognition (ZSAR) emerges as a promising paradigm, yet it faces challenges due to the spectral bias of diffusion models, which oversmooth high-frequency dynamics. Here, we propose Frequency-Aware Diffusion for Skeleton-Text Matching (FDSM), integrating a Semantic-Guided Spectral Residual Module, a Timestep-Adaptive Spectral Loss, and Curriculum-based Semantic Abstraction to address these challenges. Our approach effectively recovers fine-grained motion details, achieving state-of-the-art performance on NTU RGB+D, PKU-MMD, and Kinetics-skeleton datasets. Code has been made available at https://github.com/yuzhi535/FDSM. Project homepage: https://yuzhi535.github.io/FDSM.github.io/
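The oversmoothing of high-frequency dynamics mentioned in the abstract can be made concrete with a small numerical illustration. The sketch below is not the FDSM method or any of its modules. It only measures how much of a skeleton sequence's temporal spectral energy lies above a cutoff, and shows that a smoothing filter (used here as a stand-in for an oversmoothing model) lowers that share. The function names, the (frames, joints, coordinates) layout, the cutoff fraction, and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np


def high_freq_energy_ratio(motion, cutoff_fraction=0.25):
    """Share of temporal spectral energy at or above a cutoff bin.

    motion: array of shape (T, J, C): frames, joints, coordinates (assumed layout).
    cutoff_fraction: hypothetical threshold, as a fraction of the rFFT bin count.
    """
    motion = np.asarray(motion, dtype=float)
    # Remove each channel's temporal mean so the DC term does not dominate.
    centered = motion - motion.mean(axis=0, keepdims=True)
    # Real FFT along the time axis gives one spectrum per joint coordinate.
    power = np.abs(np.fft.rfft(centered, axis=0)) ** 2
    cutoff = int(cutoff_fraction * power.shape[0])
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[cutoff:].sum() / total)


def moving_average(motion, window=5):
    """Box-filter each joint trajectory over time (a stand-in for oversmoothing)."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), 0, motion
    )


# Synthetic sequence: a slow component (2 cycles) plus a fast one (20 cycles).
T = 64
t = np.arange(T) / T
signal = np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
motion = np.tile(signal[:, None, None], (1, 3, 2))  # 3 joints, 2 coordinates

original_ratio = high_freq_energy_ratio(motion)
smoothed_ratio = high_freq_energy_ratio(moving_average(motion))
print(original_ratio, smoothed_ratio)
```

The smoothed sequence keeps most of its slow component but loses most of the fast one, so its high-frequency share drops. A frequency-aware objective would, in general terms, penalize this kind of loss. How FDSM actually does so is described in the paper and the linked code, not here.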