DIFF-IPPO: Diffusion-Based Informative Path Planning with Open-Vocabulary Belief Maps
2026-06-15 • Robotics
AI summary
The authors developed DIFF-IPPO, a method that helps robots or drones search for objects more efficiently. It combines a belief map, which marks where a target is likely to be, with a diffusion-based planner that generates routes concentrating on the most promising areas of that map. They tested it in a simulated search-and-rescue mission in which a team of five drones searched for a burning building and made first detections in 3.5 minutes. The results indicate that the method can guide searches by focusing sensor coverage on high-belief regions.
informative path planning, diffusion-based planner, open-vocabulary belief map, Gaussian process, trajectory generation, sensor coverage, multi-modal belief, search-and-rescue robotics, target detection, robot perception
Authors
Sausar Karaf, Oleg Sautenkov, Mikhail Martynov, Dzmitry Tsetserukou
Abstract
Exploration and object search require robots to perceive their environment, identify regions of interest, and plan trajectories that improve target-detection likelihood or maximize information gain. Many informative path planning (IPP) methods, especially in continuous environmental monitoring, rely on Gaussian-process belief models, while object-search settings often produce complex, multimodal belief maps from semantic or open-vocabulary perception. Global trajectory generation directly conditioned on such non-Gaussian belief maps remains comparatively underexplored. Although diffusion-based planners offer strong capabilities for modeling such distributions, their use in informative path planning remains limited. In this work, we propose DIFF-IPPO, a pipeline that integrates an open-vocabulary belief map generator with a diffusion-based planner for global trajectory generation over belief maps. The method generates trajectories that concentrate sensor coverage over high-belief regions, achieving normalized detection scores between 81.49% and 86.55% across different dataset scenarios. We validate the system in a simulated search-and-rescue scenario where the planner searches candidate building regions to locate a burning building. In this setting, a team of five drones using batched belief-map-conditioned trajectory generation achieves first detections in 3.5 minutes.
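To make "concentrating sensor coverage over high-belief regions" concrete, the sketch below scores a trajectory by the fraction of belief mass that falls inside its sensor footprint. It is an illustration only: the abstract does not define the normalized detection score, so `covered_belief_fraction`, the Gaussian-bump belief map, the circular footprint, and all parameter values are assumptions made here, not the authors' metric or planner.

```python
import numpy as np

def make_belief_map(size, peaks, sigma):
    """Hypothetical multimodal belief map: a normalized sum of isotropic bumps."""
    ys, xs = np.mgrid[0:size, 0:size]
    belief = np.zeros((size, size), dtype=float)
    for px, py in peaks:
        belief += np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
    return belief / belief.sum()

def coverage_mask(size, waypoints, radius):
    """Mark cells within `radius` of any waypoint (assumed circular footprint).

    Coverage is sampled at waypoints only, not along the segments between them.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    mask = np.zeros((size, size), dtype=bool)
    for wx, wy in waypoints:
        mask |= (xs - wx) ** 2 + (ys - wy) ** 2 <= radius ** 2
    return mask

def covered_belief_fraction(belief, waypoints, radius):
    """Assumed score: belief mass under the footprint divided by total mass."""
    mask = coverage_mask(belief.shape[0], waypoints, radius)
    return float(belief[mask].sum() / belief.sum())

# Two belief modes, e.g. two candidate regions for the target.
belief = make_belief_map(size=64, peaks=[(16, 16), (48, 40)], sigma=4.0)

# A path that visits both modes versus one that skirts the map edge.
informed = [(16, 16), (32, 28), (48, 40)]
uninformed = [(5, 60), (32, 60), (60, 60)]

score_informed = covered_belief_fraction(belief, informed, radius=8)
score_uninformed = covered_belief_fraction(belief, uninformed, radius=8)
print(f"informed: {score_informed:.3f}, uninformed: {score_uninformed:.3f}")
```

Under these assumptions, a path through both modes covers most of the belief mass while the edge-hugging path covers almost none, which is the kind of gap a belief-conditioned planner is meant to exploit.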