Building Better Environments for Autonomous Cyber Defence

2026-04-09, Cryptography and Security

Cryptography and Security, Artificial Intelligence
AI summary

The authors organized a workshop in 2025 to discuss what makes a good environment for training AI agents to defend computer networks automatically. Experts from different backgrounds shared their practical knowledge about reinforcement learning (RL) and cyber defence environments. From these discussions, the authors created a framework to better connect RL environments with real network systems and offered guidelines for building and testing these environments effectively. Their goal is to help improve how AI agents learn to protect important computer networks.

reinforcement learning, autonomous cyber defence, RL environments, network defence, critical infrastructure, agent evaluation, cybersecurity, machine learning, environment design, government networks
Authors
Chris Hicks, Elizabeth Bates, Shae McFadden, Isaac Symes Thompson, Myles Foley, Ed Chapman, Nickolas Espinosa Dice, Ankita Samaddar, Joshua Sylvester, Himanshu Neema, Nicholas Butts, Nate Foster, Ahmad Ridley, Zoe M, Paul Jones
Abstract
In November 2025, the authors ran a workshop on what makes a good reinforcement learning (RL) environment for autonomous cyber defence (ACD). This paper details the knowledge shared by participants, both during the workshop and in their contributions to this paper shortly afterwards. The workshop participants come from academia, industry, and government, and have extensive hands-on experience designing and working with RL and cyber environments. While there is now a sizeable body of literature describing work in RL for ACD, there is nevertheless a great deal of tradecraft, domain knowledge, and common hazards that are not detailed comprehensively in a single resource. With a specific focus on building better environments to train and evaluate autonomous RL agents in network defence scenarios, including government and critical infrastructure networks, the contributions of this work are twofold: (1) a framework for decomposing the interface between RL cyber environments and real systems, and (2) guidelines on current best practice for RL-based ACD environment development and agent evaluation, based on the key findings from our workshop.