ATHENA: Accelerated Multi-Task Heterogeneous Influence Functions for Robot Data Curation

2026-06-15 · Robotics

AI summary

The authors present ATHENA, an influence-function method for choosing which demonstrations to keep when fine-tuning billion-parameter Vision-Language-Action models for robot imitation learning. It speeds up influence computation by exploiting the Kronecker structure of linear-layer gradients and by replacing dense Hessian inversion with a low-rank approximation, and it accounts for how jointly trained tasks influence one another. In simulation and on real robots, models trained on ATHENA-curated subsets of the demonstrations matched or exceeded models trained on the full dataset.

robot imitation learning, influence functions, Vision-Language-Action models, data curation, multitask learning, Hessian matrix, Kronecker structure, random truncated approximation, fine-tuning, RoboTwin
Authors
Tao Xu, Jiaxin Wang, Runhao Zhang, Jiayi Guan, Xianchao Zeng, Weixi Song, Xinyu Zhou, Zhetao Chen, Guang Chen, Yong-Lu Li
Abstract
In robot imitation learning, influence functions provide a principled approach to quantify each demonstration's effect on robot task outcomes, yet scaling them to billion-parameter Vision-Language-Action (VLA) models is limited by computational and multitask bottlenecks. To this end, we propose ATHENA, an influence function framework tailored for multitask VLA data curation at a billion-parameter scale. Concretely, it leverages the Kronecker structure of linear-layer gradients to reduce projection cost, and approximates dense Hessian inversion with a rank-r Random Truncated Approximation, achieving about a 313.4x speedup in influence computation. Furthermore, ATHENA formulates global and local interactive influence to balance data curation across 50 jointly trained tasks. Extensive evaluations on RoboTwin 2.0 and real-robot deployment, covering 9.34 and 6.90 hours of demonstrations, respectively, show that ATHENA matches or exceeds full-data joint fine-tuning using only 50% of demonstrations in simulation and 66.7% of data across six real-robot tasks. Overall, ATHENA demonstrates its effectiveness for data curation in billion-parameter multitask VLA fine-tuning.
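
The abstract's remark that linear-layer gradients have Kronecker structure can be illustrated with a small, generic sketch. It is not taken from the paper: it assumes a single input vector `x` and a single output-gradient vector `delta`, so that the weight gradient is the outer product of the two, and it assumes a projection matrix of the form `P_out ⊗ P_in`. All function and variable names are hypothetical. Under those assumptions, projecting the flattened gradient is the same as projecting `delta` and `x` separately and taking the Kronecker product of the results, which avoids ever building the full gradient or the full projection matrix.

```python
import random


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def kron_vec(u, v):
    """Kronecker product of two vectors (row-major flattening of outer(u, v))."""
    return [a * b for a in u for b in v]


def kron_mat(A, B):
    """Dense Kronecker product of two matrices; built here only for the check."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def random_matrix(rows, cols, rng):
    """Gaussian random matrix, standing in for a random projection (assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def project_naive(delta, x, P_out, P_in):
    """Materialize the flattened gradient and the full projection matrix."""
    grad_flat = kron_vec(delta, x)   # length d_out * d_in
    P = kron_mat(P_out, P_in)        # shape (k_out * k_in) x (d_out * d_in)
    return matvec(P, grad_flat)


def project_factored(delta, x, P_out, P_in):
    """Use (P_out ⊗ P_in)(delta ⊗ x) = (P_out delta) ⊗ (P_in x)."""
    return kron_vec(matvec(P_out, delta), matvec(P_in, x))


if __name__ == "__main__":
    rng = random.Random(0)
    d_out, d_in, k_out, k_in = 4, 5, 2, 3   # toy sizes, chosen arbitrarily
    delta = [rng.gauss(0.0, 1.0) for _ in range(d_out)]
    x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]
    P_out = random_matrix(k_out, d_out, rng)
    P_in = random_matrix(k_in, d_in, rng)

    naive = project_naive(delta, x, P_out, P_in)
    fast = project_factored(delta, x, P_out, P_in)
    max_diff = max(abs(a - b) for a, b in zip(naive, fast))
    print("projected dimension:", len(fast))
    print("results agree:", max_diff < 1e-9)
```

In this toy setting the factored path costs on the order of `k_out * d_out + k_in * d_in` multiplications for the two small projections, whereas the naive path multiplies a `(k_out * k_in) x (d_out * d_in)` matrix. How ATHENA constructs and applies its projections in practice is not specified in the abstract, so this should be read only as an illustration of the underlying identity.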