Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition

2026-04-09 • Machine Learning

Machine LearningArtificial IntelligenceComputation and Language

AI summaryⓘ

The authors focus on improving how groups of AI agents working together (called Multi-Agent Systems) measure how unsure they are about their decisions. Existing methods don’t work well because these systems have many back-and-forth steps and complex ways that agents communicate. They created a new method called MATU, which looks at the whole reasoning process as data structures and uses a math technique called tensor decomposition to separate and measure different types of uncertainty. Their experiments show MATU gives a better overall picture of reliability in these multi-agent setups.

Large Language ModelsMulti-Agent SystemsUncertainty QuantificationTensor DecompositionEmbedding MatricesReasoning TrajectoriesInter-Agent CommunicationCommunication TopologiesReliability Measurement

Authors

Tiejin Chen, Huaiyuan Yao, Jia Chen, Evangelos E. Papalexakis, Hua Wei

Abstract

While Large Language Model-based Multi-Agent Systems (MAS) consistently outperform single-agent systems on complex tasks, their intricate interactions introduce critical reliability challenges arising from communication dynamics and role dependencies. Existing Uncertainty Quantification methods, typically designed for single-turn outputs, fail to address the unique complexities of the MAS. Specifically, these methods struggle with three distinct challenges: the cascading uncertainty in multi-step reasoning, the variability of inter-agent communication paths, and the diversity of communication topologies. To bridge this gap, we introduce MATU, a novel framework that quantifies uncertainty through tensor decomposition. MATU moves beyond analyzing final text outputs by representing entire reasoning trajectories as embedding matrices and organizing multiple execution runs into a higher-order tensor. By applying tensor decomposition, we disentangle and quantify distinct sources of uncertainty, offering a comprehensive reliability measure that is generalizable across different agent structures. We provide comprehensive experiments to show that MATU effectively estimates holistic and robust uncertainty across diverse tasks and communication topologies.

View PDFOpen arXiv