DLLG: Dynamic Logit-Level Gating of LLM Experts

2026-06-03 • Computation and Language

AI summary

The authors introduce DLLG, a method that combines the strengths of different language models by mixing their next-token predictions (logits) at each step of text generation. Unlike routing, which commits to one model too early, or heuristic ensembling, which relies on fragile proxy signals, DLLG learns how to blend models from whether the overall answer is correct, without token-level labels or retraining the models. In the authors' experiments, DLLG outperforms routing, heuristic ensembling, and parameter-merging baselines on reasoning and coding benchmarks across model sizes. This suggests that learned logit-level fusion is a robust and scalable way to use multiple expert models together.

Large Language Models, Ensembling, Logit-level Fusion, Gating Mechanism, Token-level Prediction, Model Routing, Parameter Merging, Sparse Supervision, Trajectory-level Correctness, Specialized Experts

Authors

Bingnan Li, Zhaoyang Zhang, Xiaoze Liu, Yantao Shen, Shuli Jiang, Shuo Yang, Wei Xia, Zhuowen Tu, Stefano Soatto

Abstract

Leveraging multiple specialized LLMs can combine complementary strengths, but existing approaches trade adaptability for stability: routing commits prematurely, heuristic ensembling depends on fragile proxies, and parameter merging introduces interference. We propose DLLG (Dynamic Logit-Level Gating), a dynamic logit-level ensembling framework that learns token-level expert fusion from sparse response-level supervision. A lightweight gating module predicts step-wise fusion weights, linking trajectory-level correctness to generation without token-level labels or expert retraining. Across diverse reasoning and code benchmarks, DLLG consistently outperforms strong routing, heuristic ensembling, and parameter-merging baselines across model scales, highlighting learned logit-level fusion as a robust and scalable paradigm for integrating specialized experts.
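
To make the idea of step-wise logit-level fusion concrete, the following is a minimal sketch of a single decoding step. It is not the authors' implementation: the abstract does not specify the gate architecture, its inputs, or the training objective. The sketch assumes that all experts share one vocabulary, that the gate is a single linear layer over some per-step feature vector, and that fusion is a weighted sum in logit space. The names `gate_weights`, `fuse_logits`, `fused_step`, `features`, `W`, and `b` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def gate_weights(features, W, b):
    """Hypothetical linear gate (an assumption, not the paper's design).

    features: (D,) per-step feature vector
    W: (K, D) gate weights, b: (K,) gate bias
    Returns a (K,) vector of non-negative fusion weights that sum to 1.
    """
    return softmax(W @ features + b)


def fuse_logits(expert_logits, weights):
    """Weighted sum of expert logits.

    expert_logits: (K, V) next-token logits from K experts over a
    shared vocabulary of size V (shared vocabulary is assumed).
    weights: (K,) fusion weights.
    Returns fused logits of shape (V,).
    """
    return weights @ expert_logits


def fused_step(expert_logits, features, W, b):
    """One greedy decoding step: gate, fuse, pick the argmax token."""
    w = gate_weights(features, W, b)
    fused = fuse_logits(expert_logits, w)
    return int(np.argmax(fused)), w


if __name__ == "__main__":
    # Two toy experts over a three-token vocabulary.
    logits = np.array([[2.0, 0.0, 0.0],
                       [0.0, 0.0, 4.0]])
    feats = np.ones(4)
    W = np.zeros((2, 4))
    token, w = fused_step(logits, feats, W, np.zeros(2))
    print(token, w)
```

In this sketch the experts stay frozen and only `W` and `b` would be trained. How a response-level correctness signal is propagated back to these per-step weights is left out, since the abstract does not describe the objective.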
