Evo-Attacker: Memory-Augmented Reinforcement Learning for Long-Horizon Tool Attacks on LLM-MAS

2026-05-25 • Cryptography and Security

Cryptography and Security, Artificial Intelligence, Multiagent Systems

AI summary

The authors studied how systems made of many AI agents working together can be misled by compromised or harmful tools that they rely on. They created Evo-Attacker, a method that improves its attacks over time by storing past attack patterns in memory and reasoning about when to intervene. A training procedure called Attack-Flow GRPO uses the final outcome of an attack to improve the intermediate reasoning steps that led to it. Their tests show that Evo-Attacker outperforms baseline methods and point to the need for safeguards on tool outputs in these AI systems.

Large Language Models, Multi-Agent Systems, Reinforcement Learning, Adversarial Attacks, Memory-Augmented Learning, Credit Assignment Problem, Attack-Flow GRPO, Tool Security, Deliberative Reasoning

Authors

Bingyu Yan, Xiaoming Zhang, Jinyu Hou, Chaozhuo Li, Ziyi Zhou, Yiming Hei, Litian Zhang

Abstract

While Large Language Model-based Multi-Agent Systems (LLM-MAS) demonstrate remarkable capabilities in solving complex tasks by orchestrating specialized agents and external tools, their implicit trust in tool outputs creates a critical attack surface. Existing tool attacks are limited by domain specificity or by reliance on static templates. To address these limitations, we propose Evo-Attacker, which formulates the tool attack as a self-evolving, memory-augmented reinforcement learning process. Evo-Attacker constructs a dynamic attack memory and employs deliberative reasoning to retrieve adversarial patterns and plan modifying interventions at critical moments. Furthermore, we introduce Attack-Flow GRPO, which optimizes intermediate reasoning steps using terminal outcomes and thereby addresses the long-horizon credit assignment challenge. Comprehensive experiments demonstrate that Evo-Attacker consistently outperforms baselines, demonstrating its generalization and evolutionary capabilities and underscoring the urgent need for defensive tool safeguards.
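
The abstract does not spell out how Attack-Flow GRPO turns a terminal outcome into a learning signal for intermediate steps. As a rough illustration only, the sketch below shows the group-relative advantage used in standard GRPO, with each trajectory's advantage copied to all of its steps. This is an assumption about the general idea, not the authors' algorithm; the function names, the reward values, and the uniform per-step broadcast are all hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(terminal_rewards, eps=1e-8):
    """Normalize each trajectory's terminal reward against its sampled group.

    Standard GRPO-style advantage: (r_i - mean(r)) / (std(r) + eps).
    Assumption: one scalar reward per trajectory, observed only at the end.
    """
    mu = mean(terminal_rewards)
    sigma = pstdev(terminal_rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in terminal_rewards]


def broadcast_to_steps(advantages, steps_per_trajectory):
    """Assign each trajectory's advantage to every one of its steps.

    Hypothetical simplification: uniform credit across steps. The paper may
    weight intermediate steps differently.
    """
    return [[a] * n for a, n in zip(advantages, steps_per_trajectory)]


# Illustrative group of four sampled trajectories (made-up values).
rewards = [1.0, 0.0, 0.0, 1.0]  # terminal outcome per trajectory
steps = [3, 2, 4, 1]            # number of reasoning steps per trajectory

adv = group_relative_advantages(rewards)
per_step = broadcast_to_steps(adv, steps)
```

Under this simplification, every step of a trajectory that ends above the group average receives the same positive signal, and every step of a below-average trajectory receives the same negative one. That is one simple way a terminal outcome can reach early steps in a long-horizon rollout.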
