FERA: Uncertainty-Aware Federated Reasoning for Large Language Models
2026-05-11 • Computation and Language
AI summary
The authors explore how multiple clients with private data can help improve complex reasoning without sharing raw data, which is important when data can't be centralized. They propose a method called FERA where clients and a central server work together in rounds, exchanging reasoning steps and uncertainty estimates to gradually make better answers. Their approach can identify and fix mistakes in clients' inputs instead of ignoring them, improving trust in the results over time. They also prove that this iterative process reliably converges and show in experiments that it performs better than other similar methods, while staying efficient.
Large language models · Federated learning · Multi-step reasoning · Uncertainty estimation · Iterative refinement · Self-critique · Aggregation methods · Convergence guarantees · Privacy-preserving AI · Communication efficiency
Authors
Ruhan Wang, Chengkai Huang, Zhiyong Wang, Junda Wu, Rui Wang, Tong Yu, Julian McAuley, Lina Yao, Dongruo Zhou
Abstract
Large language models (LLMs) exhibit strong reasoning capabilities when guided by high-quality demonstrations, yet such data is often distributed across organizations that cannot centralize it due to regulatory, proprietary, or institutional constraints. We study federated reasoning, where a server improves multi-step reasoning by coordinating with heterogeneous clients holding private demonstrations, without centralized training or raw data sharing. The key challenge is that client reliability is query-dependent, while the server cannot inspect client data to determine which contributions are trustworthy. To address this, we propose Uncertainty-Aware Federated Reasoning (FERA), a training-free framework based on iterative server-client co-refinement. Across communication rounds, clients generate reasoning traces with lightweight uncertainty estimates, and the server synthesizes them into improved reasoning that is redistributed as context for the next round, progressively improving both server outputs and client-side reasoning. Within each round, Uncertainty-Aware Self-Critique Aggregation (UA-SCA) resolves conflicts among heterogeneous client traces through query-dependent trust weighting and structured cross-client verification. Rather than simply discarding low-quality traces, UA-SCA revises flawed reasoning steps to recover useful information. We provide theoretical guarantees showing that the proposed iterative protocol converges and that uncertainty-aware weighting accelerates convergence. Experiments on multiple reasoning benchmarks show that FERA consistently outperforms both federated training and training-free baselines, achieving progressively higher accuracy across rounds while maintaining communication and computational efficiency.
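The uncertainty-aware trust weighting described above can be sketched as follows. The abstract gives no implementation details, so everything here is an illustrative assumption: the `Trace` structure, the softmax-over-negated-uncertainty trust scores, and the weighted vote are stand-ins for UA-SCA's query-dependent weighting and synthesis, not the authors' actual method.

```python
import math
from dataclasses import dataclass

@dataclass
class Trace:
    """A client's reasoning trace with a lightweight uncertainty estimate
    (e.g. negated mean token log-probability; lower = more confident)."""
    client_id: int
    answer: str
    uncertainty: float

def trust_weights(traces, temperature=1.0):
    """Softmax over negated uncertainties: more-confident clients
    receive proportionally higher trust weight for this query."""
    scores = [-t.uncertainty / temperature for t in traces]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def aggregate(traces):
    """Trust-weighted vote over candidate answers -- a simplified
    stand-in for the server's cross-client synthesis step."""
    weights = trust_weights(traces)
    tally = {}
    for trace, w in zip(traces, weights):
        tally[trace.answer] = tally.get(trace.answer, 0.0) + w
    return max(tally, key=tally.get)
```

In a multi-round setting, the server would redistribute the aggregated answer (and its supporting reasoning) as shared context, so that clients regenerate traces against it in the next round; the revision of flawed steps performed by UA-SCA is omitted here.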