AI summary
The authors present HYPERPOSE, a new method for estimating 3D human poses that works by mapping body joints into a curved space called hyperbolic space, which better matches the branching structure of the human skeleton. Unlike traditional methods that use flat (Euclidean) space and can lose important geometric details, this approach preserves the relationships between joints more naturally. They introduce special attention mechanisms and training techniques to handle both spatial and time-based pose data while keeping the model stable and realistic. Tests on popular datasets show that their method improves accuracy and maintains better structure and movement consistency than existing models.
3D human pose estimation, hyperbolic space, Lorentz model, spatio-temporal reasoning, transformers, graph convolutional networks, volume distortion, Riemannian loss, Human3.6M, MPI-INF-3DHP
Authors
Vinduja T., Ashish M., Ajay Waghumbare, Upasna Singh
Abstract
We introduce HYPERPOSE, a novel 3D human pose estimation framework that performs spatio-temporal reasoning entirely within the Lorentz model of hyperbolic space $\mathbb{H}^d$ to natively preserve the hierarchical tree topology of the human skeleton. Current state-of-the-art pose estimators rely on transformers and graph convolutional networks to capture complex joint dynamics. Because these architectures operate exclusively in Euclidean space, which fundamentally mismatches the inherent tree structure of the human body, they inevitably suffer from exponential volume distortion and struggle to maintain structural coherence. To address this, we depart from flat spaces and improve geometric fidelity with Hyperbolic Kinematic Phase-Space Attention (HKPSA), which natively embeds complex joint relationships without distortion, alongside a multi-scale windowed hyperbolic attention mechanism that models temporal dynamics with $O(TW)$ complexity. Furthermore, to overcome the well-known instability of training on non-Euclidean manifolds, HYPERPOSE introduces a novel Riemannian loss suite and an uncertainty-weighted curriculum that enforce physical geodesic constraints such as bone length and velocity consistency. Extensive evaluations on the Human3.6M and MPI-INF-3DHP datasets demonstrate that HYPERPOSE achieves state-of-the-art structural and temporal coherence, significantly reducing both volume distortion and velocity error, while also setting a new state of the art in overall positional accuracy.
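As a rough illustration of the geometry the abstract refers to, the sketch below implements the standard Lorentz (hyperboloid) model at curvature $-1$: the Minkowski inner product, the exponential map at the origin, and the geodesic distance $d(x, y) = \operatorname{arccosh}(-\langle x, y \rangle_{\mathcal{L}})$. It then uses that distance as a bone length. This is not the authors' implementation. The function names, the unit curvature, the lifting of joints via the origin exponential map, and the squared-error form of the bone-length penalty are all assumptions made for illustration.

```python
import math


def minkowski_dot(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v):
    """Lift a Euclidean vector v in R^d onto the unit hyperboloid in R^(d+1).

    Standard exponential map at the origin o = (1, 0, ..., 0), curvature -1.
    """
    norm = math.sqrt(sum(a * a for a in v))
    if norm < 1e-12:
        # The zero tangent vector maps to the origin of the hyperboloid.
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]


def geodesic_distance(x, y):
    """Geodesic distance between two points on the unit hyperboloid."""
    # On the manifold -<x, y>_L >= 1; clamp to guard against rounding error.
    return math.acosh(max(1.0, -minkowski_dot(x, y)))


def bone_length_loss(pred, target, bones):
    """Hypothetical bone-length penalty (an assumption, not the paper's loss).

    pred, target: lists of hyperboloid points, one per joint.
    bones: list of (parent_index, child_index) pairs.
    Returns the mean squared difference of geodesic bone lengths.
    """
    total = 0.0
    for parent, child in bones:
        d_pred = geodesic_distance(pred[parent], pred[child])
        d_true = geodesic_distance(target[parent], target[child])
        total += (d_pred - d_true) ** 2
    return total / len(bones)


# Toy usage: a two-joint "skeleton" with a single bone.
root = exp_map_origin([0.0, 0.0, 0.0])
joint = exp_map_origin([0.3, 0.4, 0.0])
length = geodesic_distance(root, joint)  # equals the Euclidean norm of the tangent vector
loss = bone_length_loss([root, joint], [root, joint], [(0, 1)])
```

Measured from the origin, the geodesic distance to a lifted point equals the Euclidean norm of the tangent vector that produced it, which makes the toy case easy to check by hand. A practical implementation would more likely use a tensor library with a learnable or fixed curvature parameter.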