RECALL: Recovery Experience Collection for Active Lifelong Learning in Vision-Language-Action Models

2026-06-22 · Robotics

Robotics, Artificial Intelligence, Machine Learning
AI summary

The authors look at how robot systems that see, understand language, and act (Vision-Language-Action models) are usually improved: when they make mistakes, they are shown more examples, which can be slow and inefficient. They propose a smarter way where the robot actively asks for help in the parts of a task it is unsure about, making learning faster. However, they found that training only on those uncertain parts can cause the robot to forget what it already learned. To address this, the authors test methods that balance learning new things with remembering old skills. Their study shows that while targeted help is useful, keeping a robot's old knowledge while it learns new things remains challenging.

Vision-Language-Action (VLA) models, Passive imitation learning, Active learning, Uncertainty-guided data collection, Continual learning, Catastrophic forgetting, Replay-based data mixing, Elastic weight consolidation, Autoregressive models, Robot policy fine-tuning
Authors
Ulas Berk Karli, Tesca Fitzgerald
Abstract
Vision-Language-Action (VLA) models are commonly fine-tuned through passive imitation learning, where additional demonstrations are collected for tasks where the policy performs poorly. This approach has several downsides: it requires the robot to fail before data collection is triggered, provides little guidance about which states require supervision, and wastes demonstrator effort on redundant parts of the task where the policy already performs well. In this paper, we propose an active, continual learning paradigm for VLAs. We demonstrate that active, uncertainty-guided data collection leads to more efficient fine-tuning than passively collected demonstrations. However, we also find that fine-tuning only on actively collected recovery data leads to catastrophic forgetting. We evaluate techniques for continual learning, including replay-based data mixing and elastic weight consolidation, and identify tradeoffs between plasticity to uncertainty-guided recovery data and retention of previously learned behaviors. Overall, our work contributes an empirical study of active continual learning for autoregressive VLAs, establishing that uncertainty-guided recovery demonstrations can improve adaptation efficiency while also revealing open challenges when targeted new data is incorporated into large robot policies.
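The two continual learning techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `replay_fraction` and `strength` parameters, and the use of a diagonal Fisher estimate are assumptions drawn from the standard textbook forms of replay mixing and elastic weight consolidation, and parameters are plain Python floats rather than real VLA weights.

```python
import random


def mix_batch(recovery, replay, batch_size, replay_fraction, rng):
    """Build one training batch that mixes new recovery samples with replayed old samples.

    `replay_fraction` (hypothetical name) is the share of the batch drawn from
    previously seen data; the remainder comes from the new recovery data.
    """
    n_replay = round(batch_size * replay_fraction)
    n_new = batch_size - n_replay
    # Sample with replacement so small datasets can still fill a batch.
    batch = rng.choices(recovery, k=n_new) + rng.choices(replay, k=n_replay)
    rng.shuffle(batch)
    return batch


def ewc_penalty(params, anchor_params, fisher_diag, strength):
    """Quadratic EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    `anchor_params` are the weights before fine-tuning, and `fisher_diag` is an
    assumed per-parameter importance estimate (diagonal Fisher information).
    """
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    batch = mix_batch(["recovery"] * 10, ["original"] * 10, 8, 0.25, rng)
    print(batch.count("original"), "replayed samples out of", len(batch))

    # The penalty would be added to the imitation loss during fine-tuning.
    print(ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 1.0], strength=1.0))
```

In this sketch, raising `replay_fraction` or `strength` favors retention of earlier behavior, while lowering them favors plasticity toward the recovery data, which is the tradeoff the abstract describes.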