Flying to Image-Specified Objects: 3D Quadrotor Navigation via Cross-Graph Memory and Viewpoint Planning
2026-06-29 • Robotics
AI summary
The authors present a system that lets a quadrotor find and fly to the specific object shown in a query image. Because a quadrotor must be controlled in continuous 3D with a limited field of view and under safety constraints, the method chooses informative viewpoints rather than flying directly to spatial locations. A semantic memory of observed objects informs a learned policy that picks the next viewpoint, so the drone explores while keeping views that are useful for detecting the target. Simulation experiments show improvements over baselines, and real-world flights support the framework's practicality and robustness.
Instance-Specific Navigation • Quadrotor • 3D Flight Control • Field of View • Semantic Memory • Hierarchical Navigation • Trajectory Planning • Action Nodes • Frontier Exploration • Learning-based Policy
Authors
Junjie Gao, Yuqi Chen, Yongzhou Pan, Yaosheng Deng, Jiaping Xiao, Mir Feroskhan
Abstract
Instance-Specific Image-Goal Navigation (InstanceImageNav) requires a robot to navigate toward the exact object instance depicted in a query image. Extending this task to quadrotors is challenging due to continuous 3D control, limited field of view (FOV), and safety constraints, which make successful navigation highly dependent on selecting informative viewpoints. We propose a hierarchical navigation framework for quadrotor InstanceImageNav that separates high-level decision making from low-level motion execution. Instead of navigating directly to spatial locations, the system generates viewpoint-aware action nodes around frontier regions and potential target objects, enabling the robot to explore while maintaining informative viewpoints for detecting the target instance. A lightweight semantic memory maintains object-level and observation-level context, allowing semantic cues to propagate to candidate action nodes for decision making. A learning-based policy selects the most promising action node, and a trajectory planner generates dynamically feasible 3D flight paths for safe execution. Experiments in simulation demonstrate consistent improvements over strong baselines, and real-world quadrotor flights validate the practicality and robustness of the proposed framework.
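To make the idea of viewpoint-aware action nodes concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a candidate node is a 3D position plus a yaw angle, samples candidates on a horizontal ring around a region of interest (a frontier centroid or a possible target object), and replaces the learned policy with a simple argmax over externally supplied scores. The names `ActionNode`, `ring_viewpoints`, `in_horizontal_fov`, and `select_node`, along with the ring radius and the 90-degree field of view, are hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ActionNode:
    """Hypothetical candidate viewpoint: a 3D position and a heading."""
    x: float
    y: float
    z: float
    yaw: float  # radians, chosen so the camera faces the region of interest


def ring_viewpoints(center, radius=1.5, n=8, height_offset=0.0):
    """Sample n viewpoints on a horizontal ring around center, each facing it."""
    cx, cy, cz = center
    nodes = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        # Heading that points from the viewpoint back toward the center.
        yaw = math.atan2(cy - y, cx - x)
        nodes.append(ActionNode(x, y, cz + height_offset, yaw))
    return nodes


def in_horizontal_fov(node, point, fov_deg=90.0):
    """Return True if point lies inside the node's horizontal field of view."""
    bearing = math.atan2(point[1] - node.y, point[0] - node.x)
    # Wrap the angular difference into [-pi, pi).
    diff = (bearing - node.yaw + math.pi) % (2.0 * math.pi) - math.pi
    return abs(diff) <= math.radians(fov_deg) / 2.0


def select_node(nodes, scores):
    """Stand-in for the learned policy: pick the highest-scoring node."""
    best = max(range(len(nodes)), key=lambda i: scores[i])
    return nodes[best]
```

In a full system the scores would come from the policy conditioned on the semantic memory, and the selected node would be handed to a trajectory planner that checks feasibility and collisions. Neither of those stages is modeled here.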