Parallel Differentiable Reachability for Learning and Planning with Certified Neural Dynamics and Controllers
2026-05-25 • Robotics
Robotics, Artificial Intelligence, Machine Learning
AI summary
The authors present a method for bounding and controlling the behavior of robots whose dynamics models or controllers are neural networks, designed to be fast enough for online planning. The approach combines Taylor-model flowpipes with CROWN-style linear bound propagation to over-approximate every state the system can reach under bounded uncertainty, and it stays differentiable so that the bounds can be used directly in training and planning. They evaluate it on non-prehensile manipulation and quadrotor tasks, including hardware experiments and systems of up to 72 dimensions, while maintaining certified reachable-set over-approximations. The result is a reachability tool that can run inside learning and planning loops rather than only as an offline check.
Neural network dynamics, Reachability analysis, Taylor-model flowpipe, CROWN linear bound propagation, Model predictive control (MPC), Automatic differentiation, GPU acceleration, Certified training, Non-prehensile manipulation, Quadrotor control
Authors
Keyi Shen, Glen Chou
Abstract
Neural network (NN) dynamics models and control policies achieve strong performance in robotics, but providing sound guarantees under uncertainty remains difficult, especially for closed-loop NN systems. Existing reachability tools provide formal over-approximations, yet are often non-differentiable, overly conservative, or too slow for modern learning and online planning pipelines. To address this, we present a parallelizable, differentiable reachability framework in JAX for continuous- and discrete-time systems with analytical and NN-based dynamics and controllers. Our framework combines Taylor-model flowpipe construction with CROWN-style linear bound propagation through a unified representation that preserves affine dependencies while supporting GPU-batched computation and automatic differentiation. Building on this reachability primitive, we develop (i) a certified training method that encourages reachability-friendly dynamics models and controllers, and (ii) a reachability-aware sampling-based MPC scheme with gradient-based refinement. Experiments on non-prehensile manipulation and quadrotor tasks, including hardware and higher-dimensional evaluations (up to 72D), demonstrate practical online planning while maintaining certified reachable-set over-approximations under bounded uncertainty.
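To make the idea of a batched, differentiable reachability primitive concrete, the following is a minimal JAX sketch. It is not the authors' framework: it swaps in plain interval bound propagation, which is much looser than the Taylor-model and CROWN-style bounds described above and does not preserve affine dependencies. The network shape, the residual discrete-time model, and all function names (`init_params`, `step_bounds`, `reach`, `final_width`) are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def init_params(key, dim=2, hidden=16):
    """Random weights for a tiny residual NN dynamics model (hypothetical)."""
    k1, k2 = jax.random.split(key)
    return {
        "W1": 0.5 * jax.random.normal(k1, (hidden, dim)),
        "b1": jnp.zeros(hidden),
        "W2": 0.1 * jax.random.normal(k2, (dim, hidden)),
        "b2": jnp.zeros(dim),
    }


def step(params, x):
    """Nominal discrete-time dynamics: x_next = x + W2 relu(W1 x + b1) + b2."""
    h = jax.nn.relu(params["W1"] @ x + params["b1"])
    return x + params["W2"] @ h + params["b2"]


def affine_bounds(W, b, lo, hi):
    """Interval image of the box [lo, hi] under the map x -> W x + b."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    mid = W @ center + b
    rad = jnp.abs(W) @ radius
    return mid - rad, mid + rad


def step_bounds(params, lo, hi):
    """One-step box over-approximation (interval bound propagation, not CROWN)."""
    h_lo, h_hi = affine_bounds(params["W1"], params["b1"], lo, hi)
    h_lo, h_hi = jax.nn.relu(h_lo), jax.nn.relu(h_hi)  # ReLU is monotone
    d_lo, d_hi = affine_bounds(params["W2"], params["b2"], h_lo, h_hi)
    # Adding the boxes for x and for the residual ignores their dependency,
    # so the result is sound but conservative.
    return lo + d_lo, hi + d_hi


def reach(params, lo0, hi0, horizon=5):
    """Boxes enclosing the reachable set at steps 1..horizon."""
    def body(carry, _):
        lo, hi = step_bounds(params, *carry)
        return (lo, hi), (lo, hi)

    _, (los, his) = jax.lax.scan(body, (lo0, hi0), None, length=horizon)
    return los, his


def final_width(params, lo0, hi0):
    """Scalar size of the last box; differentiable w.r.t. the weights."""
    los, his = reach(params, lo0, hi0)
    return jnp.sum(his[-1] - los[-1])


params = init_params(jax.random.PRNGKey(0))
lo0, hi0 = jnp.array([-0.1, -0.1]), jnp.array([0.1, 0.1])

# Gradient of the reachable-set width with respect to the model weights.
grads = jax.grad(final_width)(params, lo0, hi0)

# Batched reachability over several initial boxes in one call.
lo_batch = jnp.stack([lo0, lo0 - 0.1])
hi_batch = jnp.stack([hi0, hi0 + 0.1])
batch_los, batch_his = jax.vmap(reach, in_axes=(None, 0, 0))(params, lo_batch, hi_batch)
```

In this sketch, a gradient such as `grads` is the kind of signal a certified training loss could penalize to encourage tighter reachable sets, and the `jax.vmap` call mirrors how many candidate initial sets or control sequences could be bounded at once in a sampling-based planner. Both uses are illustrative and do not reproduce the paper's training or MPC procedures.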