FERA: Uncertainty-Aware Federated Reasoning for Large Language Models
2026-05-11 • Computation and Language
AI summary
The authors explore how multiple clients with private data can help improve complex reasoning without sharing raw data, which is important when data can't be centralized. They propose a method called FERA where clients and a central server work together in rounds, exchanging reasoning steps and uncertainty estimates to gradually make better answers. Their approach can identify and fix mistakes in clients' inputs instead of ignoring them, improving trust in the results over time. They also prove that this iterative process reliably converges and show in experiments that it performs better than other similar methods, while staying efficient.
Large language models · Federated learning · Multi-step reasoning · Uncertainty estimation · Iterative refinement · Self-critique · Aggregation methods · Convergence guarantees · Privacy-preserving AI · Communication efficiency
Authors
Ruhan Wang, Chengkai Huang, Zhiyong Wang, Junda Wu, Rui Wang, Tong Yu, Julian McAuley, Lina Yao, Dongruo Zhou
Abstract
Large language models (LLMs) exhibit strong reasoning capabilities when guided by high-quality demonstrations, yet such data is often distributed across organizations that cannot centralize it due to regulatory, proprietary, or institutional constraints. We study federated reasoning, where a server improves multi-step reasoning by coordinating with heterogeneous clients holding private demonstrations, without centralized training or raw data sharing. The key challenge is that client reliability is query-dependent, while the server cannot inspect client data to determine which contributions are trustworthy. To address this, we propose Uncertainty-Aware Federated Reasoning (FERA), a training-free framework based on iterative server-client co-refinement. Across communication rounds, clients generate reasoning traces with lightweight uncertainty estimates, and the server synthesizes them into improved reasoning that is redistributed as context for the next round, progressively improving both server outputs and client-side reasoning. Within each round, Uncertainty-Aware Self-Critique Aggregation (UA-SCA) resolves conflicts among heterogeneous client traces through query-dependent trust weighting and structured cross-client verification. Rather than simply discarding low-quality traces, UA-SCA revises flawed reasoning steps to recover useful information. We provide theoretical guarantees showing that the proposed iterative protocol converges and that uncertainty-aware weighting accelerates convergence. Experiments on multiple reasoning benchmarks show that FERA consistently outperforms both federated training and training-free baselines, achieving progressively higher accuracy across rounds while maintaining communication and computational efficiency.
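The uncertainty-aware trust weighting described above can be sketched as follows. The abstract gives no implementation details, so everything here is an illustrative assumption: the `Trace` structure, the softmax-over-negated-uncertainty trust scores, and the weighted vote are stand-ins for UA-SCA's query-dependent weighting and synthesis, not the authors' actual method.

```python
import math
from dataclasses import dataclass

@dataclass
class Trace:
    """A client's reasoning trace with a lightweight uncertainty estimate
    (e.g. negated mean token log-probability; lower = more confident)."""
    client_id: int
    answer: str
    uncertainty: float

def trust_weights(traces, temperature=1.0):
    """Softmax over negated uncertainties: more-confident clients
    receive proportionally higher trust weight for this query."""
    scores = [-t.uncertainty / temperature for t in traces]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def aggregate(traces):
    """Trust-weighted vote over candidate answers -- a simplified
    stand-in for the server's cross-client synthesis step."""
    weights = trust_weights(traces)
    tally = {}
    for trace, w in zip(traces, weights):
        tally[trace.answer] = tally.get(trace.answer, 0.0) + w
    return max(tally, key=tally.get)
```

In a multi-round setting, the server would redistribute the aggregated answer (and its supporting reasoning) as shared context, so that clients regenerate traces against it in the next round; the revision of flawed steps performed by UA-SCA is omitted here.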