Action-Prior Denoising for Smooth Real-Time Chunking

2026-05-25 • Robotics

AI summary

The authors address chunked robot action planning when there is a delay between inference and execution. They extend training-time Real-time Chunking (RTC) with Soft RTC, which treats the later overlap actions as editable but anchored to the previous plan, rather than as either fixed or fully unconstrained. On 12 Kinetix levels, a short soft window nearly matches hard training-time RTC in solve rate (0.809 vs. 0.815), and a medium window reduces high-delay action delta and jerk by 9.1% and 9.6%, with runtime close to the naive baseline. A small preliminary real-robot sorting study also suggests that Soft RTC produces smoother commanded actions.

Real-time chunking, Action policies, Inference delay, Denoising, Kinetix dataset, Robotic action planning, Overlap actions, Token-wise blending, Action smoothness metrics, Real-robot sorting

Authors

Dongyang Liu, Zhaowen Zheng, Yu Sun, Longxu Zhang, Yixuan Liu, Hao Wan

Abstract

Real-time chunking (RTC) lets chunked action policies operate under inference delay by conditioning a newly generated action chunk on actions already committed by the previous chunk. Training-time RTC simulates this delay during learning and avoids expensive guidance at deployment, but its binary prefix mask treats all non-prefix tokens as fully unconstrained. This under-models asynchronous execution: early overlap actions are fixed, while later overlap actions remain editable but should still stay close to the previous plan. We propose Soft RTC, a training-time RTC generalization based on action-prior denoising. Soft RTC constructs corrupted overlap tokens from partially denoised states instead of pure noise and injects the aligned previous chunk as the same prior during inference through a lightweight token-wise blending rule. On the 12 released large Kinetix levels, a short soft window nearly matches hard training-time RTC in overall solve rate (0.809 vs. 0.815), while a medium window reduces high-delay action delta and jerk by 9.1% and 9.6% relative to hard RTC. Both variants keep near-naive runtime, unlike inference-time RTC baselines. A small preliminary real-robot sorting study provides additional evidence that training-time RTC can improve completion and that Soft RTC gives the lowest commanded-action finite-difference metrics among the tested policies.
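
The abstract does not spell out the blending rule or the metric definitions, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes one scalar action per token, a linear weight ramp over the soft window, a convex blend between the current denoised estimate and the aligned previous chunk, and "action delta" and "jerk" taken as mean absolute first and third finite differences. All function and parameter names (`soft_prior_weights`, `blend_with_prior`, `mean_abs_diff`, `delay`, `soft_window`) are hypothetical.

```python
from typing import List, Sequence


def soft_prior_weights(horizon: int, delay: int, soft_window: int) -> List[float]:
    """Per-token weight on the previous chunk: 1.0 = frozen, 0.0 = unconstrained.

    Assumption: a linear ramp over the soft window. With soft_window == 0 this
    reduces to a binary prefix mask (frozen prefix, everything else free).
    """
    weights = []
    for i in range(horizon):
        if i < delay:
            weights.append(1.0)  # already committed by the previous chunk
        elif i < delay + soft_window:
            k = i - delay + 1  # position inside the soft window, 1-based
            weights.append(1.0 - k / (soft_window + 1))  # strictly in (0, 1)
        else:
            weights.append(0.0)  # free tokens
    return weights


def blend_with_prior(
    current: Sequence[float], prior: Sequence[float], weights: Sequence[float]
) -> List[float]:
    """Token-wise convex blend of the current estimate with the aligned prior."""
    return [w * p + (1.0 - w) * c for c, p, w in zip(current, prior, weights)]


def mean_abs_diff(actions: Sequence[float], order: int) -> float:
    """Mean absolute finite difference of the given order.

    Assumption: order 1 stands in for "action delta", order 3 for "jerk".
    """
    if len(actions) <= order:
        raise ValueError("need more actions than the difference order")
    seq = list(actions)
    for _ in range(order):
        seq = [b - a for a, b in zip(seq, seq[1:])]
    return sum(abs(v) for v in seq) / len(seq)


if __name__ == "__main__":
    w = soft_prior_weights(horizon=8, delay=2, soft_window=3)
    blended = blend_with_prior([0.0] * 8, [1.0] * 8, w)
    print(w)
    print(blended)
    print(mean_abs_diff(blended, 1), mean_abs_diff(blended, 3))
```

In this sketch, widening `soft_window` pulls more of the overlap toward the previous plan, which is the trade-off the abstract describes between the short window (solve rate) and the medium window (lower delta and jerk).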
