Hand-in-the-Loop: Improving Dexterous VLA via Seamless Interventional Correction

2026-05-14 · Robotics

Robotics, Machine Learning
AI summary

The authors study how robots with complex hands can be controlled better when humans help correct their actions during tasks like using tools or handling objects with both hands. They found that traditional methods where humans take full control cause sudden jerky movements called "gesture jumps." Their new method, Hand-in-the-Loop (HandITL), smoothly mixes human corrections with the robot's own actions to avoid these jumps. This approach makes robot control more stable, reduces mistakes, and speeds up task completion. Using data from HandITL also helps train better robot policies compared to standard human control data.

Vision-Language-Action (VLA) models, dexterous manipulation, Interactive Imitation Learning (IIL), human-in-the-loop, teleoperation, bimanual coordination, gesture jumps, policy refinement, high-dimensional action spaces, robotic hands
Authors
Zhuohang Li, Liqun Huang, Wei Xu, Zhengming Zhu, Nie Lin, Xiao Ma, Xinjun Sheng, Ruoshi Wen
Abstract
Vision-Language-Action (VLA) models are prone to compounding errors in dexterous manipulation, where high-dimensional action spaces and contact-rich dynamics amplify small policy deviations over long horizons. While Interactive Imitation Learning (IIL) can refine policies through human takeover data, applying it to high-degree-of-freedom (DoF) robotic hands remains challenging due to a command mismatch between human teleoperation and policy execution at the takeover moment, which causes abrupt robot-hand configuration changes, or "gesture jumps". We present Hand-in-the-Loop (HandITL), a seamless human-in-the-loop intervention method that blends human corrective intent with autonomous policy execution to avoid gesture jumps during bimanual dexterous manipulation. Compared with direct teleoperation takeover, HandITL reduces takeover jitter by 99.8% and preserves robust post-takeover manipulation, reducing grasp failures by 87.5% and mean completion time by 19.1%. We validate HandITL on tasks requiring bimanual coordination, tool use, and fine-grained long-horizon manipulation. When used to collect intervention data for policy refinement, HandITL yields policies that outperform those trained with standard teleoperation data by 19% on average across three long-horizon dexterous tasks.
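The abstract does not specify how HandITL blends human intent with the policy, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple linear ramp between the policy's joint command and the operator's joint command, and compares the largest per-step change against a direct takeover. The function names (`blend_takeover`, `max_jump`), the three-joint commands, and the ramp length are all hypothetical.

```python
from typing import List, Sequence


def blend_takeover(
    policy_cmd: Sequence[float],
    human_cmd: Sequence[float],
    step: int,
    ramp_steps: int,
) -> List[float]:
    """Linearly interpolate from the policy command to the human command.

    Hypothetical blending rule: alpha rises from 0 to 1 over `ramp_steps`
    control steps after the takeover begins. The paper may use a
    different scheme.
    """
    if ramp_steps <= 0:
        raise ValueError("ramp_steps must be positive")
    alpha = min(1.0, max(0.0, step / ramp_steps))
    return [(1.0 - alpha) * p + alpha * h for p, h in zip(policy_cmd, human_cmd)]


def max_jump(trajectory: Sequence[Sequence[float]]) -> float:
    """Largest absolute per-joint change between consecutive commands."""
    return max(
        max(abs(b - a) for a, b in zip(prev, cur))
        for prev, cur in zip(trajectory, trajectory[1:])
    )


# Made-up joint targets (radians) at the takeover moment.
policy = [0.2, 0.8, 0.5]  # what the policy is commanding
human = [0.9, 0.1, 0.5]   # what the operator's hand pose maps to

# Direct takeover: the command switches to the human target in one step.
direct = [policy] + [human] * 10

# Blended takeover: the command ramps to the human target over 10 steps.
blended = [blend_takeover(policy, human, k, 10) for k in range(11)]

direct_jump = max_jump(direct)
blended_jump = max_jump(blended)
print(f"direct takeover max jump:  {direct_jump:.3f}")
print(f"blended takeover max jump: {blended_jump:.3f}")
```

In this toy setting the blended trajectory still ends at the operator's target, but the per-step change is spread over the ramp instead of landing in a single control step, which is the kind of command mismatch the abstract calls a "gesture jump".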