Agent-Centric Social Trajectory Prediction: A Free Energy Principle Perspective

2026-05-25 • Artificial Intelligence


AI summary

The authors present FEP-Diff, a method for predicting how agents (such as people or cars) will move when each agent can observe only its local surroundings. The approach is grounded in the Free Energy Principle, which is used to keep predictions consistent with how agents form beliefs and behave. An encoder combines information about an agent's own motion and its social surroundings. A belief learner then infers a multimodal distribution over the agent's latent beliefs, with a constraint that keeps the beliefs of neighboring agents consistent. Finally, a diffusion model generates future paths conditioned on those beliefs. Experiments on five public benchmarks show that the method outperforms prior approaches when observability is restricted.

Keywords: trajectory prediction, Free Energy Principle, partial observability, spatiotemporal encoding, belief inference, social interaction modeling, latent distributions, diffusion models, multimodal prediction, ego-motion dynamics

Authors

Yanping Wu, Ji Zhang, Hao Chen, Edmond S. L. Ho, Chongfeng Wei

Abstract

Trajectory prediction methods have demonstrated remarkable capabilities in capturing complex motion patterns. However, existing methods rely on global state assumptions, suffer from insufficient belief inference under partial observability, and lack cognitive behavioral constraints in prediction. These limitations severely compromise both deployment feasibility and physical plausibility in real-world settings. In this work, we propose FEP-Diff, an agent-centric trajectory prediction framework grounded in the Free Energy Principle, aimed at achieving cognitively plausible predictions under realistic constraints. Specifically, a dual-branch spatiotemporal encoder extracts ego-motion dynamics and social interaction cues from local observations. Building upon this, a goal-conditioned belief learner infers multimodal latent belief distributions optimized via a free-energy objective, with a social consistency constraint on the local neighborhood graph to promote cognitive alignment among neighboring agents. Finally, a residual diffusion trajectory generator is conditioned on the learned belief representations with token-level proxy conditioning, producing precise and diverse future predictions. Extensive experiments on five public benchmarks demonstrate that FEP-Diff consistently outperforms state-of-the-art methods under restricted observability. Code: https://anonymous.4open.science/r/FEP-Diff-8876.
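
The abstract does not spell out the free-energy objective or the social consistency constraint, so the following is only a minimal sketch of the general idea, not FEP-Diff's actual loss. It assumes the standard variational free energy (a prediction-error term plus a KL divergence between a posterior belief and a prior), a single diagonal Gaussian belief per agent rather than the multimodal beliefs the paper describes, and a consistency penalty defined as the mean squared distance between the belief means of neighboring agents. All function names, arguments, and the weighting parameter `lam` are hypothetical.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def free_energy(pred, target, mu_q, logvar_q, mu_p, logvar_p):
    """Generic variational free energy (assumed form, not the paper's):
    half squared prediction error (accuracy) plus KL to the prior (complexity)."""
    accuracy = 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))
    return accuracy + gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)


def social_consistency(belief_means, neighbors):
    """Mean squared distance between belief means over directed neighbor pairs.

    belief_means: dict mapping agent id to a list of floats (hypothetical layout).
    neighbors: dict mapping agent id to a list of neighboring agent ids.
    """
    total, count = 0.0, 0
    for i, nbrs in neighbors.items():
        for j in nbrs:
            total += sum((a - b) ** 2 for a, b in zip(belief_means[i], belief_means[j]))
            count += 1
    return total / count if count else 0.0


def total_loss(per_agent_free_energy, belief_means, neighbors, lam=0.1):
    """Sum of per-agent free energies plus a weighted consistency penalty.
    The weight lam is an illustrative assumption."""
    return sum(per_agent_free_energy) + lam * social_consistency(belief_means, neighbors)
```

In this sketch each agent's loss uses only its own prediction and belief, and agents are coupled solely through the neighbor graph, which mirrors the agent-centric, local-observation setting the abstract describes.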
