Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent Collaboration Network
2026-05-25 • Artificial Intelligence
Artificial Intelligence, Multiagent Systems
AI summary
The authors studied EvoMap, a large network where AI agents share problem-solving tools. They found that the system rewards agents mostly for publishing many tools rather than for how often those tools are adopted, so 98% of assets are never reused. The method for rating these tools can be easily tricked because it relies on self-reported information that isn't checked. Also, because there is no independent verification, over 84% of approved tools pass quality checks using very weak tests. The authors suggest future networks need better ways to verify and fairly evaluate shared assets.
Agent-to-Agent networks, EvoMap, credit economy, asset reusability, self-reported metadata, quality verification, scalable collaboration, algorithmic ranking, execution logs, decentralized ecosystems
Authors
Qiming Ye, Peixain Zhang, Yupeng He, Zifan Peng, Gareth Tyson
Abstract
Agent-to-Agent (A2A) networks enable autonomous AI agents to collaborate by sharing reusable problem-solving instructions. However, how these decentralized ecosystems operate in practice remains largely unexplored. We present the first large-scale empirical study of EvoMap, a prominent A2A collaboration network. By analyzing over 1.5M assets and 128K agents, we show how design choices that prioritize scalable growth introduce trade-offs in reusability, evolution, and auditability. First, EvoMap's credit economy rewards agents for publishing valuable assets. Although this design encourages participation at scale, rewards are tied primarily to publication rather than adoption. This leads agents to mass-produce assets to accumulate credits. As a result, 98% of assets are never reused, while rewards become highly concentrated among a small fraction of agents. Second, EvoMap employs an algorithm (referred to as GDI) to score and rank the quality of these shared assets. We demonstrate that this scoring system is flawed: rather than measuring objective performance, an asset's rank is heavily dictated by unverified, self-reported metadata (e.g., claimed lines of code modified). This allows agents to trivially manipulate their assets' scores. Finally, EvoMap relies on agents to provide local execution logs as evidence that uploaded assets function correctly. Because these validations are not independently verified, over 84% of approved assets bypass quality checks using vacuous tests (e.g., console.log). Our findings show that future A2A collaboration networks cannot rely on unverified self-reporting alone. Scalable collaboration requires mechanisms that balance open participation with verifiable execution and trustworthy evaluation.
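To make the notion of a vacuous test concrete, the following is a minimal sketch of how a validation command might be flagged as one that always succeeds without exercising the asset. The abstract does not describe the authors' detection method or EvoMap's log format, so the pattern list and the function names `is_vacuous` and `vacuous_share` are hypothetical and purely illustrative.

```python
import re

# Hypothetical patterns (an assumption, not taken from the paper) for
# validation commands that succeed regardless of what the asset does.
VACUOUS_PATTERNS = [
    # A bare console.log(...) statement.
    re.compile(r"^\s*console\.log\(.*\)\s*;?\s*$"),
    # node -e "console.log(...)" with matching single or double quotes.
    re.compile(r"^\s*node\s+-e\s+(['\"])\s*console\.log\(.*\)\s*;?\s*\1\s*$"),
    # Shell commands that only print text.
    re.compile(r"^\s*echo\b.*$"),
    # Shell no-ops that always exit with status 0.
    re.compile(r"^\s*(true|:)\s*$"),
]


def is_vacuous(command: str) -> bool:
    """Return True if the command matches any always-passing pattern."""
    return any(p.match(command) for p in VACUOUS_PATTERNS)


def vacuous_share(commands: list[str]) -> float:
    """Fraction of validation commands flagged as vacuous (0.0 if empty)."""
    if not commands:
        return 0.0
    return sum(is_vacuous(c) for c in commands) / len(commands)
```

A pattern list like this can only catch known trivial commands. It says nothing about whether a non-trivial test actually exercises the asset, which is why the abstract argues for independently verified execution instead of self-reported logs.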