Beyond Prompt-Based Planning: MCP-Native Graph Planning-based Biomedical Agent System

2026-06-03 • Artificial Intelligence

AI summary

The authors identify problems in current biomedical agents, which struggle due to inconsistent software tools and simple planning methods. They create BioManus, a new system that organizes many bioinformatics tools into a structured graph to make planning easier and more efficient. BioManus converts diverse tools into a standard format and uses this graph to focus only on relevant parts for each task, improving accuracy and performance. Their tests show that this structured approach works better than previous methods. They propose that future biomedical reasoning should use these structured graphs instead of just larger lists of tools.

Biomedical agents, Bioinformatics tools, MCP-native, Graph-scaffolded planning, BioinfoMCP Compiler, Workflow orchestration, Context compression, Tool heterogeneity, Capability graphs, Task-specific retrieval

Authors

Zhangtianyi Chen, Florensia Widjaja, Wufei Dai, Xiangjun Zhang, Yuhao Shen, Juexiao Zhou

Abstract

Biomedical agents promise to automate complex biological workflows, yet current systems face two fundamental bottlenecks: bioinformatics tools are highly heterogeneous in interfaces and execution environments, and agent planning still relies on flat, prompt-retrieved tool descriptions. As biomedical software ecosystems grow, this coupling between tool coverage and context size leads to tool confusion, unstable planning, and inefficient execution. We introduce BioManus, an MCP-native biomedical agent built on graph-scaffolded planning over structured biological capabilities. BioManus first introduces the BioinfoMCP Compiler, which converts heterogeneous bioinformatics software into standardized MCP servers, yielding a large executable MCP ecosystem. It then organizes this ecosystem as a typed heterogeneous MCP graph over tools, operations, datatypes, and workflow stages. At inference time, BioManus retrieves compact task-specific subgraphs and synthesizes operation-level workflow scaffolds. This design decouples planning complexity from raw tool inventory size, achieving a context compression ratio of $\Theta(N / (h \cdot \bar{m}))$ under high-recall retrieval, where $N$ is the total tool count, $h$ is the workflow horizon, and $\bar{m} \ll N$ is the average number of candidate tools per operation. Experiments on BioAgentBench and LAB-Bench show that BioManus improves execution accuracy, workflow validity, and context efficiency over advanced biomedical agent baselines. This work suggests a paradigm shift: scalable biomedical reasoning requires structured executable capability graphs rather than ever-larger prompt-level tool retrieval.
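To make the compression ratio concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy graph that maps operations to candidate tools and reads the ratio as N divided by (h times m_bar), i.e. the full tool inventory over the number of tools that reach the planner after subgraph retrieval. The class `CapabilityGraph`, its methods, the function `compression_ratio`, and the operation-to-tool assignments are all hypothetical illustrations; the actual BioManus graph is typed over tools, operations, datatypes, and workflow stages, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass
class CapabilityGraph:
    """Toy operation -> tools mapping (hypothetical schema, not from the paper)."""

    tools_by_operation: dict = field(default_factory=dict)

    def add_tool(self, tool: str, operation: str) -> None:
        # Register a tool as a candidate for one operation.
        self.tools_by_operation.setdefault(operation, []).append(tool)

    def total_tools(self) -> int:
        # N: number of distinct tools in the whole inventory.
        return len({t for tools in self.tools_by_operation.values() for t in tools})

    def retrieve_subgraph(self, workflow: list) -> dict:
        # Keep only the operations the task needs, with their candidate tools.
        return {op: list(self.tools_by_operation.get(op, [])) for op in workflow}


def compression_ratio(n_total: int, horizon: int, mean_candidates: float) -> float:
    # N / (h * m_bar): full inventory size over the retrieved candidate count.
    return n_total / (horizon * mean_candidates)


# Illustrative inventory; the groupings are assumptions made for this example.
inventory = {
    "quality_control": ["fastqc", "fastp"],
    "alignment": ["bwa", "bowtie2"],
    "variant_calling": ["gatk", "bcftools"],
    "assembly": ["spades", "megahit"],
    "quantification": ["salmon", "kallisto"],
    "annotation": ["prokka", "snpeff"],
}

graph = CapabilityGraph()
for operation, tools in inventory.items():
    for tool in tools:
        graph.add_tool(tool, operation)

# A three-step workflow, so the horizon h is 3.
workflow = ["quality_control", "alignment", "variant_calling"]
subgraph = graph.retrieve_subgraph(workflow)

n_total = graph.total_tools()
horizon = len(workflow)
mean_candidates = sum(len(tools) for tools in subgraph.values()) / horizon
ratio = compression_ratio(n_total, horizon, mean_candidates)

print(f"N={n_total}, h={horizon}, m_bar={mean_candidates}, ratio={ratio}")
```

In this toy inventory the planner sees 6 of 12 tools, a ratio of 2. If h and m_bar stay fixed while the inventory grows, the ratio grows linearly in N, which is the sense in which planning context is decoupled from inventory size.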
