SING: Synthetic Intention Graph for Scalable Active Tool Discovery in LLM Agents

2026-06-15 • Computation and Language

AI summary

The authors studied how large language model (LLM) agents select from large collections of tools to complete complex tasks. They argue that injecting every tool schema into the agent's context in advance is costly and restricts the agent to a fixed, predefined inventory. To address this, they built SING, a framework that links user intentions, tool capabilities, and tool collaboration patterns in a graph, and retrieves tools as the task state evolves. On three real-world tool-use benchmarks over a corpus of 7,471 tools, SING retrieved the right tools more accurately than baselines while exposing the agent to far fewer tool schemas. The results suggest that agents can discover tools more accurately and with less context in large tool ecosystems.

large language models, agent harness, tool selection, retrieval-augmented, intention-aware, graph structure, tool capabilities, task decomposition, multi-turn execution, tool ecosystem

Authors

Qiao Xiao, Haochen Shi, Yisen Gao, Wenbin Hu, Huihao Jing, Tianshi Zheng, Baixuan Xu, Ziheng Zhang, Weiqi Wang, Haoran Li, Jiaxin Bai, Yangqiu Song

Abstract

Large language model (LLM) agents increasingly rely on agent harnesses that manage context, tools, and multi-turn execution, making tools a central interface for acting in realistic digital environments. As harness-connected tool ecosystems expand to hundreds or thousands of APIs, services, and task-specific skills, exhaustive tool schema injection becomes costly and imposes a closed-world assumption that limits agents to a predefined static inventory. Retrieval-augmented tool selection offers a natural alternative, but existing one-shot retrieval methods often fail to align isolated tool descriptions with the agent's true task intention, especially in long-horizon tasks where required capabilities emerge through decomposition, observations, and newly induced subgoals. We propose SING, an intention-aware active tool discovery framework that builds an intention-tool graph linking user intentions, tool capabilities, and tool collaboration patterns, and dynamically retrieves tools according to evolving task states. Using a unified corpus of 7,471 tools, we evaluate SING on three real-world tool-use benchmarks. SING improves Global Recall@5 by up to 59.8% and downstream success rate by up to 28.9% over baselines, while reducing full-corpus tool-schema exposure by 99.8%, demonstrating that intention-aware graph structure enables more accurate and context-efficient tool discovery in large-scale agentic ecosystems.
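
To make the idea of an intention-tool graph more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy graph in which intention nodes point to tools and tools point to frequent collaborators, and it scores tools by simple word overlap with the current task state. All names (`INTENT_TO_TOOLS`, `TOOL_COLLAB`, `retrieve_tools`, `collab_weight`) and all data are hypothetical; the abstract does not specify how SING builds or queries its graph.

```python
from collections import defaultdict

# Hypothetical toy graph; none of these intentions or tools come from the paper.
# Intention node -> tools whose capabilities serve that intention.
INTENT_TO_TOOLS = {
    "book travel": {"flight_search", "hotel_search"},
    "pay for booking": {"payment_api"},
    "send confirmation": {"email_sender"},
}

# Tool -> tools it is assumed to be commonly used with (collaboration edges).
TOOL_COLLAB = {
    "flight_search": {"payment_api"},
    "hotel_search": {"payment_api"},
    "payment_api": {"email_sender"},
}


def tokens(text):
    """Lowercase whitespace tokenization (a stand-in for a real encoder)."""
    return set(text.lower().split())


def retrieve_tools(task_state, k=5, collab_weight=0.5):
    """Return up to k tool names ranked for the current task state.

    Step 1: score each tool by the word overlap between the task state
            and the intention nodes that link to it.
    Step 2: pass a fraction of each directly matched tool's score to its
            collaborators (one hop only).
    """
    scores = defaultdict(float)
    state = tokens(task_state)

    for intent, tools in INTENT_TO_TOOLS.items():
        words = tokens(intent)
        overlap = len(state & words) / len(words)
        if overlap == 0:
            continue
        for tool in tools:
            scores[tool] += overlap

    # Iterate over a snapshot so that only directly matched tools propagate.
    for tool, score in list(scores.items()):
        for neighbor in TOOL_COLLAB.get(tool, ()):
            scores[neighbor] += collab_weight * score

    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Retrieval is repeated as the task state evolves across turns.
first_turn = retrieve_tools("book travel to Paris")
later_turn = retrieve_tools("pay for booking then send confirmation")
```

Calling `retrieve_tools` again whenever a new subgoal or observation appears is what distinguishes this style of retrieval from a one-shot lookup: only a handful of tool schemas are surfaced per turn rather than the full corpus. A real system would presumably replace the word-overlap score with learned embeddings and a richer graph traversal.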
