HoloAgent-0: A Unified Embodied Agent Framework with 3D Spatial Memory
2026-06-22 • Robotics
Robotics, Computer Vision and Pattern Recognition
AI summary
The authors created HoloAgent-0, a system that helps robots follow language instructions and perform tasks in the real world by organizing skills and managing resources efficiently. Unlike previous methods that used separate parts for navigation or manipulation, their system unifies these capabilities into a single framework with three layers: execution control, 3D spatial memory, and robot actions. They tested HoloAgent-0 on real robots doing tasks like moving around, finding objects, and working together, showing it can handle complex, long tasks while adjusting to feedback. This work aims to make robot control more integrated and adaptive in physical environments.
Embodied Agent, Skill Graph, Closed-loop Execution, 3D Spatial Memory, Mobile Manipulation, Robot Coordination, Runtime Feedback, Navigation, Robot Control Framework, Embodiment
Authors
Xiaolin Zhou, Liu Liu, Tingyang Xiao, Wei Feng, Fa Fu, Xinrui Meng, Xinjie Wang, Jialiang Han, Boyang Yu, Yun Du, Wei Sui, Zhizhong Su
Abstract
LLM agents follow a practical execution loop in digital environments: they reason over structured states, invoke tools, inspect feedback, and revise actions. Extending this loop to physical robots is difficult because physical execution is continuous, embodiment-dependent, uncertain, and constrained by safety. Existing embodied-AI systems have advanced manipulation, spatial understanding, navigation, and humanoid control, but these capabilities often remain specialized modules or loosely coupled decision loops. In this work, we introduce HoloAgent-0, a unified embodied agent framework for real-world robot deployment. HoloAgent-0 organizes heterogeneous robot models and controllers through three coupled layers: Embodied AgentOS for closed-loop execution, 3D spatial memory for physical world grounding, and embodied skills for robot action. Embodied AgentOS converts language instructions into executable skill graphs, schedules robot resources, monitors execution, and triggers clarification or re-planning from runtime feedback. We deploy HoloAgent-0 on real hardware and evaluate its spatial memory, long-horizon navigation, and closed-loop execution across motion generation, object search, cross-robot coordination, and mobile manipulation.
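The closed-loop execution described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Skill` class, the `execute` function, the retry policy standing in for re-planning, and the skill and resource names are all hypothetical, and resource scheduling is reduced to a label on each skill. The sketch only shows the general pattern of running a skill graph in dependency order, checking feedback after each step, and escalating to clarification when retries run out.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


@dataclass
class Skill:
    """One node of a hypothetical skill graph (names are illustrative)."""
    name: str
    resource: str                 # robot resource the skill uses, e.g. "base" or "arm"
    run: Callable[[], bool]       # returns True on success, False on failure
    deps: Set[str] = field(default_factory=set)


def execute(skills: Dict[str, Skill], max_retries: int = 1) -> List[str]:
    """Run skills in dependency order; a retry stands in for re-planning."""
    graph = {s.name: s.deps for s in skills.values()}
    log: List[str] = []
    for name in TopologicalSorter(graph).static_order():
        skill = skills[name]
        for _ in range(max_retries + 1):
            ok = skill.run()      # runtime feedback from the skill
            log.append(f"{name}[{skill.resource}]:{'ok' if ok else 'fail'}")
            if ok:
                break
        else:
            # All attempts failed: stop and request clarification.
            log.append(f"{name}:clarify")
            break
    return log


# Assumed example task: navigate, search for an object, then grasp it.
attempts = {"search": 0}


def flaky_search() -> bool:
    """Fails on the first call and succeeds afterwards."""
    attempts["search"] += 1
    return attempts["search"] > 1


plan = {
    "navigate": Skill("navigate", "base", lambda: True),
    "search": Skill("search", "camera", flaky_search, {"navigate"}),
    "grasp": Skill("grasp", "arm", lambda: True, {"search"}),
}
log = execute(plan)
print(log)
```

In this toy run the search skill fails once, is retried, and the graph then completes. A real system would presumably replace the retry with a planner call that rewrites the remaining graph, and would arbitrate shared resources rather than only recording them.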