Temporal Logic Guidance for Action-Only Diffusion Policies with World Models

2026-06-22 · Robotics

AI summary

The authors study how to make a robot's diffusion policy follow user-specified rules, written in Signal Temporal Logic (STL), at inference time. A separate learned world model predicts the outcome of the planned actions so that rule satisfaction can be scored, and the gradient of that score adjusts the plan without retraining the policy. On the Robomimic Can Transport task, constraint violations drop from over 80% for baseline methods to 4% while task success stays at 100%. Unlike earlier STL guidance methods, the approach does not require the policy to generate future states along with actions, which the authors say adds complexity and runtime.

diffusion policies, robot behavior, Signal Temporal Logic (STL), world model, action planning, constraint satisfaction, gradient guidance, Robomimic, task performance
Authors
Moritz Zoellner, Anastasios Manganaris, Rohan Paleja
Abstract
Diffusion policies enable multimodal robot behavior but offer limited ability to choose among behavior modes at inference time, even though such control is desirable in human-robot settings. Prior solutions to this lack of control have utilized Signal Temporal Logic (STL) to express human intentions and provide corresponding guidance for diffusion policy inference. However, these approaches can only guide diffusion policies that jointly generate future actions and states, increasing both complexity and runtime. We propose a novel guidance method for action-only diffusion policies that uses a separate learned world model to enable differentiable evaluation of STL robustness, with its gradient then injected into the diffusion process. This steers behavior toward constraint satisfaction without retraining, improving constraint adherence while preserving task performance. On the Can Transport task from Robomimic, our method maintains 100% task success while reducing constraint violations from over 80% for baseline methods to 4%. We also discuss extensions toward improved robustness and more complex constraints.
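
The guidance idea in the abstract (roll candidate actions through a world model, score the predicted states with a differentiable STL robustness, and push the actions along its gradient) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the paper: the names `rollout`, `soft_robustness`, `robustness_grad`, and `guided_step` are hypothetical, the world model is a hand-written additive stand-in for a learned network, the constraint is the simple STL formula "always s_t <= limit", and the gradient is derived by hand for this toy model where a real system would likely use automatic differentiation. The denoising network is omitted, so only the guidance term is shown.

```python
import math

def rollout(s0, actions):
    """Toy stand-in for a learned world model: s_{t+1} = s_t + a_t."""
    states, s = [], s0
    for a in actions:
        s = s + a
        states.append(s)
    return states

def soft_robustness(states, limit, beta=10.0):
    """Smooth robustness of the STL formula G (s_t <= limit).

    The exact robustness is min_t (limit - s_t). A log-sum-exp soft
    minimum keeps it differentiable. Returns the value and the softmin
    weights, which are the partial derivatives w.r.t. each margin.
    """
    margins = [limit - s for s in states]
    m = min(margins)  # shift by the minimum for numerical stability
    exps = [math.exp(-beta * (r - m)) for r in margins]
    total = sum(exps)
    value = m - math.log(total) / beta
    weights = [e / total for e in exps]
    return value, weights

def robustness_grad(weights):
    """Gradient of the soft robustness w.r.t. each action.

    In the toy model, action k raises every state from step k onward,
    so its gradient is minus the sum of the weights from k to the end.
    """
    grad, tail = [0.0] * len(weights), 0.0
    for k in reversed(range(len(weights))):
        tail += weights[k]
        grad[k] = -tail
    return grad

def guided_step(actions, s0, limit, scale=0.1, beta=10.0):
    """One guidance update: nudge the actions up the robustness gradient."""
    _, weights = soft_robustness(rollout(s0, actions), limit, beta)
    grad = robustness_grad(weights)
    return [a + scale * g for a, g in zip(actions, grad)]

# Hypothetical action sample whose rollout violates s_t <= 1.0.
actions = [0.5, 0.5, 0.5, 0.5]
before, _ = soft_robustness(rollout(0.0, actions), limit=1.0)
for _ in range(50):
    actions = guided_step(actions, 0.0, limit=1.0)
after, _ = soft_robustness(rollout(0.0, actions), limit=1.0)
```

In a full diffusion policy, a term like the one in `guided_step` would be added at each reverse-diffusion step alongside the denoiser's own update, so the sample stays close to the learned action distribution while moving toward constraint satisfaction. The guidance scale trades off the two.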