HumanFlow -- Diffusion-Driven MAV Navigation Among Humans via Tightly-Coupled Motion Tracking, Forecasting, and Control

2026-05-25

Robotics
AI summary

The authors developed HumanFlow, a latent diffusion model that tracks and forecasts human motion in 3D, even when the person is partly hidden from view. It unifies tracking where people are now with forecasting where they will go next, conditioned on the surrounding 3D scene. The authors report that HumanFlow outperforms state-of-the-art methods in tracking accuracy while being significantly more efficient, including under heavy occlusion. They also couple the model's latent space to a flow-matching-based, approximate MPC policy, which lets a micro aerial vehicle (MAV) navigate around people. In simulation with real human trajectories, the policy remained collision-free even when the human was only partially observable.

3D human motion tracking, latent diffusion model, motion forecasting, occlusion handling, scene context, model predictive control (MPC), flow matching, social navigation, MAV (micro aerial vehicle), partial observability
Authors
Simon Schaefer, Joshua Näf, Stefan Leutenegger
Abstract
Robust and accurate perception of humans in their 3D scene context is essential for integrating robots into everyday environments. Existing approaches, however, often fail to produce plausible and accurate human motion estimates that are consistent with the surrounding scene, especially in the presence of heavy occlusions or partial visibility. This can limit both safety and efficiency for robotic operations. We introduce HumanFlow, a latent diffusion model that unifies human motion tracking and forecasting, conditioned on the 3D scene context. We show that our human motion model produces smooth and accurate predictions under challenging conditions, including heavy occlusions, and outperforms state-of-the-art methods in tracking accuracy while being significantly more efficient. Furthermore, we show how HumanFlow's latent space can be tightly coupled with control by conditioning a flow-matching-based, approximate MPC policy on these representations. We validate our policy in simulation with real human trajectories for MAV social navigation, demonstrating superior navigation performance and remaining collision-free, even under partial observability of the human.
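
To illustrate what sampling from a flow-matching policy conditioned on a latent can look like, the following is a minimal toy sketch, not the authors' implementation. The abstract does not specify the network, the action parameterization, or how the latent enters the policy, so everything here is an assumption: `toy_velocity`, `sample_action`, and the mapping from latent to target action are hypothetical names and choices made only for illustration. A learned velocity network is replaced by the closed-form velocity field that transports noise to a single target point along the linear path x_t = (1 - t) x_0 + t x_1.

```python
import random


def toy_velocity(x, t, target):
    """Closed-form flow-matching velocity for a point-mass target.

    Under the linear path x_t = (1 - t) * x0 + t * x1 with x1 fixed to
    `target`, the velocity at (x, t) is (target - x) / (1 - t).
    A trained network would replace this function (assumption).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(latent, steps=10, seed=0):
    """Euler-integrate dx/dt = v(x, t | latent) from t=0 (noise) to t=1."""
    rng = random.Random(seed)
    # Hypothetical conditioning: the latent is mapped to a target action
    # by a fixed scaling. A real policy would learn this dependence.
    target = [2.0 * z for z in latent]
    x = [rng.gauss(0.0, 1.0) for _ in latent]  # x0 drawn from N(0, I)
    dt = 1.0 / steps
    for k in range(steps):
        t = k / steps  # t stays below 1, so 1 - t is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy latent standing in for a human-motion representation (assumption).
action = sample_action([0.5, -1.0, 0.25])
```

In this toy setting the final Euler step lands on the target regardless of the initial noise, because the step size equals the remaining time. With a learned velocity field and a multimodal action distribution, different noise samples would generally yield different actions.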