Scene-aware Prediction of Diverse Human Movement Goals
2026-06-29 • Computer Vision and Pattern Recognition
AI summary
The authors developed a method to predict where a person might go next by looking at a current photo of the scene and the person's body position. Since people can act unpredictably due to different goals, their method uses a special model called a Conditional Variational Autoencoder (CVAE) to suggest multiple possible future destinations. This helps robots or autonomous systems plan better by anticipating human movements without needing detailed knowledge about the environment. They tested their approach on two datasets and showed it works well in different situations.
human behavior prediction, goal prediction, Conditional Variational Autoencoder, human pose, RGB scene, stochastic movement, trajectory prediction, autonomous systems, GTA-IM dataset, PROX dataset
Authors
Qiaoyue Yang, Amadeus Weber, Magnus Jung, Ayoub Al-Hamadi, Sven Wachsmuth
Abstract
Anticipating human behaviour helps autonomous systems plan proactively. Human behaviour can be stochastic because goals vary. A person's goals typically guide their movement and can therefore help predict human trajectories and motion over the long term. When inferring human movement intentions, the environmental context plays a significant role, in addition to the social cues expressed by the individual. Previous work on human goal prediction either requires semantic knowledge of the scene or only addresses interactions with objects. In this paper, we propose a novel multi-goal prediction method that uses a generative model to address the stochasticity of human movement. Based on a Conditional Variational Autoencoder (CVAE), it leverages the current RGB scene and the human pose to predict diverse potential future goals of human movement. Our results demonstrate that the approach can generate multiple movement goals in the scene by sampling in the latent space of the CVAE, and that it generalizes across scenarios in the GTA-IM and PROX datasets. Code is publicly available at https://github.com/Q-Y-Yang/DiverseGoalsPrediction.
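The idea of obtaining several goals by sampling the CVAE latent space can be illustrated with a minimal sketch. It is not the authors' implementation: the decoder here is a hypothetical stand-in (a small network with random, untrained weights), the condition vector is a placeholder for encoded RGB scene and pose features, and the names `make_toy_decoder` and `sample_goals`, the dimensions, and the 2D goal output are all assumptions made for illustration.

```python
import math
import random


def make_toy_decoder(latent_dim, cond_dim, hidden_dim, rng):
    """Hypothetical stand-in for a trained CVAE decoder (random weights, 2D output)."""
    in_dim = latent_dim + cond_dim
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(hidden_dim)] for _ in range(in_dim)]
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(2)] for _ in range(hidden_dim)]

    def decoder(z, condition):
        # Concatenate the latent code with the condition, as a CVAE decoder does.
        x = list(z) + list(condition)
        hidden = [
            math.tanh(sum(x[i] * w1[i][j] for i in range(in_dim)))
            for j in range(hidden_dim)
        ]
        return tuple(
            sum(hidden[j] * w2[j][k] for j in range(hidden_dim)) for k in range(2)
        )

    return decoder


def sample_goals(decoder, condition, num_goals, latent_dim, rng):
    """Draw latent codes from the prior N(0, I) and decode each with the same condition."""
    goals = []
    for _ in range(num_goals):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        goals.append(decoder(z, condition))
    return goals


rng = random.Random(0)
latent_dim, cond_dim = 8, 16
decoder = make_toy_decoder(latent_dim, cond_dim, hidden_dim=32, rng=rng)

# Placeholder for features that would be extracted from the RGB scene and human pose.
condition = [rng.gauss(0.0, 1.0) for _ in range(cond_dim)]

goals = sample_goals(decoder, condition, num_goals=5, latent_dim=latent_dim, rng=rng)
for goal in goals:
    print(goal)
```

Because the condition is held fixed while the latent code varies, each sample yields a different candidate goal for the same scene and pose, which is the mechanism the abstract relies on for diversity.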