Security of OpenClaw Agents: Fundamentals, Attacks, and Countermeasures

2026-05-25 · Artificial Intelligence

AI summary

The authors describe OpenClaw, a new class of open-source AI agent frameworks that run continuously, retain memory of past interactions, and use many tools (skills) across multiple channels. These features let OpenClaw agents carry out complex, multi-step tasks on their own, but they also create new security risks, such as poisoned skills, manipulation of the agent's reasoning, failures that cascade across multiple agents, and supply-chain vulnerabilities. The paper reviews the architecture of OpenClaw agents, organizes security and privacy threats into layers covering agent reasoning, action execution, and external interaction, and surveys current defenses. The authors also point out open challenges in making these systems reliable and trustworthy.

Large Language Models, Autonomous Agents, Persistent Memory, Skill Augmentation, Attack Surface, Skill Poisoning, Cognitive Manipulation, Multi-agent Systems, Supply Chain Vulnerabilities, AI Security
Authors
Yuntao Wang, Jianle Ba, Han Liu, Yanghe Pan, Jintao Wei, Zhou Su, Tom H. Luan, Linkang Du
Abstract
The rapid evolution of large language model (LLM)-driven autonomous agents has given rise to OpenClaw, a new class of open-source agent frameworks that operate as continuously running, skill-augmented systems with persistent memory, multi-channel interaction, and high degrees of autonomy. Such capabilities enable OpenClaw agents to autonomously execute complex, multi-step tasks and interact seamlessly with external applications, but simultaneously introduce a substantially enlarged attack surface. In particular, the combination of high-privilege operations and persistent memory exposes OpenClaw agents to various emerging threats, including skill poisoning, cognitive manipulation, multi-agent cascading failures, and supply-chain vulnerabilities. In this survey, we present a comprehensive study of the security landscape of OpenClaw agents. We first examine the general architecture and key characteristics that distinguish OpenClaw agents from traditional AI agent systems. We then categorize existing security and privacy threats into a layered framework and analyze how vulnerabilities arise during agent reasoning, action execution, and external interaction. Representative defense mechanisms are also reviewed to map the current defense landscape. Finally, several unresolved issues related to the reliability and trustworthiness of OpenClaw ecosystems are discussed.