RoamFlow: Reinforcement-Aligned One-Step Action MeanFlow Policy for Image-Goal Navigation

2026-06-29 · Robotics

AI summary

The authors focus on teaching robots to navigate to a location given only a picture of the destination. They noticed that current methods have trouble planning long routes, so they created RoamFlow, which predicts the average velocity along a trajectory and can therefore generate a route in just a few steps, keeping inference fast. They trained their system first by imitating expert behavior, then by using trial-and-error (reinforcement) learning to refine it for the task. Tests in both simulated and real-world environments showed that their method navigates well while remaining fast enough for real-time use.

image-goal navigation, embodied robotics, reinforcement learning, trajectory synthesis, MeanFlow, imitation learning, policy refinement, Habitat simulation, real-time inference
Authors
Zixuan Zhang, Yuqi Chen, Junjie Gao, Siyuan Song, Yongzhou Pan, Beichen Wang, Mir Feroskhan
Abstract
Image-goal navigation is a key challenge in embodied robotics, where an agent must reach a target specified solely by a goal image. While existing reinforcement learning approaches map perceptual observations directly to actions, they struggle to model long-horizon dependencies, often leading to suboptimal trajectories. To address this limitation, we propose RoamFlow, a generative navigation framework that leverages MeanFlow to predict the average velocity field for trajectory synthesis, enabling efficient few-step generation and reducing inference latency. We further adopt a two-stage training strategy that combines expert imitation for stable initialization with reinforcement learning for task-specific policy refinement. Extensive experiments both in Habitat simulation and on real-world robotic platforms demonstrate that RoamFlow achieves efficient inference while maintaining strong navigation performance under real-time constraints.
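To make the "average velocity" idea concrete, the following is a minimal sketch of few-step sampling in the general MeanFlow style, not the authors' implementation. It assumes the common convention that a sample moves from noise at time t = 1 to data at time t = 0, and that a model returns the average velocity u(z, r, t) over the interval [r, t], so one update is z_r = z_t - (t - r) * u(z_t, r, t). The function `toy_average_velocity` is a hypothetical, closed-form stand-in for a learned network (a real policy would condition on the observation and goal image instead of being handed the target), and the action values are illustrative only.

```python
import numpy as np


def toy_average_velocity(z, r, t, target):
    """Hypothetical stand-in for a learned average-velocity network.

    For a linear noise-to-data path whose data distribution is a single
    point `target`, the average velocity over [r, t] is (z - target) / t.
    A trained model would predict this from (z, r, t, observation, goal).
    """
    return (z - target) / t


def sample_actions(avg_velocity, noise, target, num_steps=1):
    """Move from noise (t = 1) to an action sequence (t = 0) in num_steps updates."""
    times = np.linspace(1.0, 0.0, num_steps + 1)
    z = noise
    for t, r in zip(times[:-1], times[1:]):
        # Displacement over [r, t] equals interval length times average velocity.
        z = z - (t - r) * avg_velocity(z, r, t, target)
    return z


rng = np.random.default_rng(0)
# Illustrative 3-step action chunk of (linear, angular) velocity commands.
target_actions = np.array([[0.5, 0.0], [0.5, 0.1], [0.4, 0.2]])
noise = rng.standard_normal(target_actions.shape)

one_step = sample_actions(toy_average_velocity, noise, target_actions, num_steps=1)
four_step = sample_actions(toy_average_velocity, noise, target_actions, num_steps=4)
```

In this toy setting the one-step and four-step samplers land on the same action sequence, which is the property that lets an average-velocity model trade many small integration steps for one or a few network evaluations.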