Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition
2026-04-09 • Machine Learning
Machine LearningArtificial IntelligenceComputation and Language
AI summaryⓘ
The authors focus on improving how groups of AI agents working together (called Multi-Agent Systems) measure how unsure they are about their decisions. Existing methods don’t work well because these systems have many back-and-forth steps and complex ways that agents communicate. They created a new method called MATU, which looks at the whole reasoning process as data structures and uses a math technique called tensor decomposition to separate and measure different types of uncertainty. Their experiments show MATU gives a better overall picture of reliability in these multi-agent setups.
Large Language ModelsMulti-Agent SystemsUncertainty QuantificationTensor DecompositionEmbedding MatricesReasoning TrajectoriesInter-Agent CommunicationCommunication TopologiesReliability Measurement
Authors
Tiejin Chen, Huaiyuan Yao, Jia Chen, Evangelos E. Papalexakis, Hua Wei
Abstract
While Large Language Model-based Multi-Agent Systems (MAS) consistently outperform single-agent systems on complex tasks, their intricate interactions introduce critical reliability challenges arising from communication dynamics and role dependencies. Existing Uncertainty Quantification methods, typically designed for single-turn outputs, fail to address the unique complexities of the MAS. Specifically, these methods struggle with three distinct challenges: the cascading uncertainty in multi-step reasoning, the variability of inter-agent communication paths, and the diversity of communication topologies. To bridge this gap, we introduce MATU, a novel framework that quantifies uncertainty through tensor decomposition. MATU moves beyond analyzing final text outputs by representing entire reasoning trajectories as embedding matrices and organizing multiple execution runs into a higher-order tensor. By applying tensor decomposition, we disentangle and quantify distinct sources of uncertainty, offering a comprehensive reliability measure that is generalizable across different agent structures. We provide comprehensive experiments to show that MATU effectively estimates holistic and robust uncertainty across diverse tasks and communication topologies.