CoorDex: Coordinating Body and Hand Priors for Continuous Dexterous Humanoid Loco-Manipulation

2026-06-22 · Robotics

Robotics, Artificial Intelligence, Machine Learning
AI summary

The authors present CoorDex, a method for teaching a humanoid robot with a dexterous hand to manipulate objects while walking, rather than stopping to use its hand. Starting from simulated demonstrations, they train motion-tracking teachers for the robot's body and hand, distill them into reusable motion priors, and combine these priors in a reinforcement learning policy that controls body and hand together. The approach lets the robot grasp and carry a bottle without stopping, open a fridge door on the move, and pick up and turn a cube. Ablations on the walk-grasp-carry task show that joint-space PPO, joint-space hand control, and monolithic latent prediction all fail under the same reward budget, while the coordinated latent approach makes the task trainable.

humanoid robot, loco-manipulation, high degree-of-freedom (DoF), reinforcement learning, latent residual control, proximal policy optimization (PPO), dexterous manipulation, motion tracking, simulated demonstrations, proprioception
Authors
Sikai Li, Shuning Li, Zhenyu Wei, Yunchao Yao, Chenran Li, Mingyu Ding
Abstract
Humanoid loco-manipulation is often simplified into a stop-and-go process: walking to an object, stopping to manipulate it, and then resuming locomotion. It also commonly relies on low degree-of-freedom (DoF) end effectors that behave like an open-close grasp primitive. We introduce CoorDex, a learning pipeline that converts high-dimensional body and dexterous hand control into coordinated latent residual control, enabling high-DoF dexterous loco-manipulation on the move. Starting from simulated whole-body and hand demonstrations, CoorDex trains privileged motion tracking teachers for the humanoid body and dexterous hand, distills them into proprioception-conditioned latent priors, and uses the frozen priors as the action space for downstream residual reinforcement learning. A coordinated latent residual policy composes these priors through shared task context and separate body-hand residual heads, preserving natural whole-body motion while improving finger-level contact reliability. CoorDex enables a Unitree G1 humanoid with a 20-DoF WUJI hand to execute dexterous manipulation while in motion, including non-stop bottle grasping and carrying, fridge door opening on the move, and cube pick-and-turn. Ablations on the walk-grasp-carry task show that joint-space PPO, joint-space hand control, and monolithic latent prediction all fail under the same reward budget, while the latent-prior interface and coordinated residual structure make high-dimensional contact-rich loco-manipulation trainable. Project Page: https://skevinci.github.io/coordex/
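
The coordinated latent residual structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each frozen prior proposes a base latent from proprioception and decodes a latent into joint targets, and that the policy adds a scaled residual to each base latent. All class names, layer sizes, the 29-DoF body, and the residual scale are hypothetical; only the 20-DoF hand comes from the abstract. Random weights stand in for trained networks.

```python
import math
import random

random.seed(0)

BODY_DOF = 29   # assumption: the abstract does not state the body DoF
HAND_DOF = 20   # from the abstract: 20-DoF WUJI hand
BODY_LATENT, HAND_LATENT, CTX_DIM, TASK_DIM = 8, 6, 16, 12  # hypothetical sizes


def make_layer(in_dim, out_dim):
    """Random weights standing in for a trained layer (illustrative only)."""
    return [[random.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(layer, x):
    """Single bias-free tanh layer: y = tanh(W x)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in layer]


class FrozenPrior:
    """Hypothetical distilled prior; its weights are never updated downstream."""

    def __init__(self, proprio_dim, latent_dim, action_dim):
        self.latent_dim = latent_dim
        self.encoder = make_layer(proprio_dim, latent_dim)
        self.decoder = make_layer(proprio_dim + latent_dim, action_dim)

    def base_latent(self, proprio):
        # Latent proposed by the prior from proprioception alone (assumption).
        return forward(self.encoder, proprio)

    def decode(self, proprio, latent):
        # Map (proprioception, latent) to normalized joint targets.
        return forward(self.decoder, proprio + latent)


class CoordinatedResidualPolicy:
    """Shared task-context trunk with separate body and hand residual heads."""

    def __init__(self, task_dim, body_prior, hand_prior, residual_scale=0.1):
        self.body_prior = body_prior
        self.hand_prior = hand_prior
        self.residual_scale = residual_scale
        self.trunk = make_layer(task_dim, CTX_DIM)  # the only trainable parts
        self.body_head = make_layer(CTX_DIM, body_prior.latent_dim)
        self.hand_head = make_layer(CTX_DIM, hand_prior.latent_dim)

    def _residual_latent(self, prior, head, ctx, proprio):
        base = prior.base_latent(proprio)
        delta = forward(head, ctx)
        return [b + self.residual_scale * d for b, d in zip(base, delta)]

    def act(self, task_obs, body_proprio, hand_proprio):
        ctx = forward(self.trunk, task_obs)  # context shared by both heads
        z_body = self._residual_latent(self.body_prior, self.body_head, ctx, body_proprio)
        z_hand = self._residual_latent(self.hand_prior, self.hand_head, ctx, hand_proprio)
        body_action = self.body_prior.decode(body_proprio, z_body)
        hand_action = self.hand_prior.decode(hand_proprio, z_hand)
        return body_action + hand_action  # list concatenation: body then hand


body_prior = FrozenPrior(BODY_DOF, BODY_LATENT, BODY_DOF)
hand_prior = FrozenPrior(HAND_DOF, HAND_LATENT, HAND_DOF)
policy = CoordinatedResidualPolicy(TASK_DIM, body_prior, hand_prior)
action = policy.act([0.1] * TASK_DIM, [0.05] * BODY_DOF, [0.05] * HAND_DOF)
print(len(action))
```

In this sketch the reinforcement learner only outputs small corrections in the two latent spaces, so the search space is BODY_LATENT + HAND_LATENT dimensions rather than the full joint space, and both heads read the same context vector. A monolithic variant, by contrast, would predict one combined latent from a single head.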