Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking From Sparse Inertial Sensors and Ranging-Based Between-Sensor Distances
2026-06-01 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition • Graphics
AI summary
The authors present Ultra Diffusion Poser, a method for tracking full-body motion from a sparse set of wearable inertial measurement units (IMUs) combined with between-sensor distances measured by ultra-wideband (UWB) ranging. Previous approaches fed UWB distances to the network only as an additional input feature. This method instead uses the distances to reconstruct where each sensor sits in 3D space. The reconstructed sensor layout, the IMU signals, and the UWB distances together condition a diffusion model that estimates body pose. Because the network's predictions can still violate the measured distances, a guidance step during sampling encourages the predicted poses to agree with them. The authors report state-of-the-art accuracy, with joint position error reduced by up to 22% over prior work.
Inertial Measurement Units, Ultra-wideband Ranging, 3D Sensor Layout, Diffusion Model, Pose Estimation, Spatial Layout Module, Inter-sensor Distance, Drift Mitigation, Human Motion Capture, UWB-Diffusion Guidance
Authors
Dominik Hollidt, Tommaso Bendinelli, Christian Holz
Abstract
Methods using inertial measurement units (IMUs) provide a wearable alternative to camera-based motion capture. To mitigate drift from inertial signals, recent sparse inertial pose estimators integrate inter-sensor distances measured by ultra-wideband (UWB) ranging. So far, UWB distances have only been used as an additional input feature, ignoring the physical constraints they impose on sensor positions. However, these distances can also be used to reconstruct the underlying 3D sensor layout, which in turn provides more informative input for pose reconstruction. We propose Ultra Diffusion Poser, a diffusion model that explicitly models these geometric constraints. It includes a Spatial Layout Module that analytically reconstructs the 3D sensor positions from UWB measurements. These sensor positions are used alongside IMU signals and UWB distances as a conditioning signal during diffusion. Still, network predictions can violate inter-sensor distance measurements. To address this, we introduce UWB-Diffusion Guidance, which encourages alignment between predicted poses and measured distances during diffusion sampling. Together, these contributions enable our model to achieve state-of-the-art performance, reducing joint position error by up to 22% over prior work.
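To illustrate how pairwise distances alone can determine a 3D sensor layout, the following is a minimal sketch using classical multidimensional scaling (MDS). The abstract does not say which analytical procedure the Spatial Layout Module uses, so MDS is an assumption chosen here because it is a standard closed-form technique. The function names, the six-sensor setup, and the noise-free distances are all hypothetical. The recovered layout is defined only up to rotation, reflection, and translation.

```python
import numpy as np


def pairwise_distances(points: np.ndarray) -> np.ndarray:
    """Return the (n, n) matrix of Euclidean distances between n points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def classical_mds(distances: np.ndarray, dim: int = 3) -> np.ndarray:
    """Recover coordinates from a full pairwise distance matrix.

    Assumption: this is classical MDS, not necessarily the procedure
    used in the paper. The result is unique only up to a rigid
    transform and reflection.
    """
    n = distances.shape[0]
    squared = distances ** 2
    # Double centering turns squared distances into a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ squared @ centering
    # eigh returns eigenvalues in ascending order; keep the largest `dim`.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues caused by noise or round-off.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical layout: six sensors at random 3D positions.
rng = np.random.default_rng(0)
true_positions = rng.normal(size=(6, 3))
measured = pairwise_distances(true_positions)

layout = classical_mds(measured, dim=3)
recovered = pairwise_distances(layout)
max_error = float(np.abs(recovered - measured).max())
```

With exact distances, the reconstructed layout reproduces the measured distance matrix to within floating-point error. Real UWB ranges are noisy and may have missing pairs, which this sketch does not handle.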