RoAd-RL: A Unified Library and Benchmark for Robust Adversarial Reinforcement Learning

2026-06-29 • Machine Learning

Machine Learning, Artificial Intelligence

AI summary

The authors introduce RoAd-RL, an open-source framework for testing how well reinforcement learning agents withstand adversarial perturbations, which are deliberately crafted disturbances that can severely degrade an agent's performance. The framework offers a unified, reproducible way to evaluate policies, attacks, and defenses, and it integrates with Stable-Baselines3 and Gymnasium. Evaluating DQN, PPO, and SAC agents in LunarLander and Highway-v0 under 192 attack-defense configurations, the authors find that some commonly used defenses can hurt performance more than the attacks they are meant to mitigate, while temporal smoothing performs well consistently. The work is intended as a standardized benchmark for comparing the robustness of reinforcement learning agents under attack.

Deep Reinforcement Learning, Adversarial Perturbations, Benchmarking Framework, Robustness Metrics, Stable-Baselines3, Gymnasium, DQN, PPO, SAC, Temporal Smoothing

Authors

Adithya Mohan, Daniel Kriegl, Torsten Schön

Abstract

Deep Reinforcement Learning (DRL) has achieved significant success in robotics and autonomous systems, yet remains vulnerable to adversarial perturbations that can severely degrade performance. Research in adversarial reinforcement learning is often limited by fragmented implementations, inconsistent evaluation protocols, and poor reproducibility. To address these challenges, we present RoAd-RL, an open-source benchmarking framework that provides unified abstractions for policies, attacks, defenses, and robustness metrics, together with reproducible evaluation pipelines and seamless integration with Stable-Baselines3 and Gymnasium. We evaluate DQN, PPO, and SAC agents in LunarLander and Highway-v0 under 192 attack-defense configurations. Results reveal substantial variations in robustness across environments and show that some commonly used defenses can be more detrimental than the attacks they aim to mitigate, while temporal smoothing consistently achieves strong performance. RoAd-RL establishes a standardized benchmark for adversarial reinforcement learning research and is publicly available at https://pypi.org/project/road-rl.
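
As an illustration of the attack-defense pairing described above, the following is a minimal sketch and not RoAd-RL's API. The abstract does not define temporal smoothing, so the sketch assumes it means an exponential moving average over consecutive observations. The names `perturb_observation` and `TemporalSmoother`, and the parameters `epsilon` and `alpha`, are hypothetical and chosen only for this example.

```python
import random
from typing import List, Optional, Sequence


def perturb_observation(
    obs: Sequence[float], epsilon: float, rng: random.Random
) -> List[float]:
    """Hypothetical attack: add uniform noise in [-epsilon, epsilon] per feature."""
    return [x + rng.uniform(-epsilon, epsilon) for x in obs]


class TemporalSmoother:
    """Hypothetical defense: exponential moving average over observations.

    This is an assumed reading of "temporal smoothing", not the paper's
    definition. alpha = 1.0 disables smoothing; smaller values weight the
    history more heavily.
    """

    def __init__(self, alpha: float = 0.5) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[List[float]] = None

    def reset(self) -> None:
        """Clear the history; call at the start of every episode."""
        self._state = None

    def __call__(self, obs: Sequence[float]) -> List[float]:
        if self._state is None:
            # First observation of an episode passes through unchanged.
            self._state = list(obs)
        else:
            self._state = [
                self.alpha * x + (1.0 - self.alpha) * s
                for x, s in zip(obs, self._state)
            ]
        return list(self._state)
```

In an evaluation loop, the perturbed observation would be passed through the smoother before it reaches the policy, and `reset()` would be called whenever the environment resets, so that history from one episode does not leak into the next.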
