ResMerge: Residual-based Spectral Merging of Large Language Models
2026-06-01 • Computation and Language
AI summary
The authors study how to combine multiple expert models trained with reinforcement learning without retraining. They find that when each model's task vector is split into a leading 'head' and a 'residual', both parts carry substantial task knowledge, so the head should not be treated as the only important part. The head is concentrated and informative but prone to sharp conflicts across experts, while the residual is more dispersed and more stable to merge. Their method, ResMerge, first merges the residuals into a stable backbone and then adds head information back only where experts agree. In experiments across several groups of RL experts and capability domains, ResMerge preserves expert capabilities better than representative task-vector and spectral merging baselines.
model merging, reinforcement learning, spectral decomposition, task vectors, singular value decomposition, Frobenius sphere, expert models, cross-expert conflicts, consensus direction, residual components
Authors
Yandu Sun, Zhiyan Hou, Haokai Ma, Yuheng Jia, Junfeng Fang, Haiyun Guo, Hongyan An, Weizhen Wang, Jinqiao Wang
Abstract
Model merging offers a training-free way to combine multiple post-trained expert models, but merging experts obtained through reinforcement learning (RL) remains challenging. Existing spectral merging methods often assume that leading singular directions contain the main task signal, while lower-energy residual components can be compressed, selected, or attenuated to reduce interference. We find that this assumption does not hold for RL task vectors: after decomposing each task vector into a leading spectral head and a residual component, both parts can independently recover substantial behavior knowledge, while exhibiting different merging properties. The head is highly concentrated and informative but more prone to sharp cross-expert conflicts, whereas the residual component is more dispersed and provides a more stable basis for aggregation. Based on this observation, we propose ResMerge, a residual-based spectral merging framework for RL experts. ResMerge first constructs a stable residual backbone with Spherical Residual Consensus Adaptation, which estimates a reliability-weighted consensus direction on the Frobenius sphere. It then reintroduces leading-head information through a Lightweight Head Correction module gated by positive cross-expert agreement. Experiments across multiple RL expert groups and capability domains show that ResMerge better preserves expert capabilities than representative task-vector and spectral merging baselines. The implementation of ResMerge is publicly available at https://github.com/sunyd0303-cpu/ResMerge-release.
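The head/residual split described in the abstract can be illustrated with a small NumPy sketch. This is not the released ResMerge implementation. The function names, the head rank `k`, the uniform reliability weights, and the rule for rescaling the consensus direction are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows (1) splitting a task vector into a rank-k spectral head and its residual via SVD, and (2) one plausible way to form a weighted consensus direction on the Frobenius unit sphere.

```python
import numpy as np


def split_head_residual(task_vector, k):
    """Split a 2-D task vector into a rank-k spectral head and the residual.

    The head keeps the k leading singular directions; the residual is
    everything else, so head + residual reconstructs the task vector.
    """
    u, s, vt = np.linalg.svd(task_vector, full_matrices=False)
    head = (u[:, :k] * s[:k]) @ vt[:k, :]
    residual = task_vector - head
    return head, residual


def spherical_consensus(residuals, weights):
    """Weighted consensus of residuals on the Frobenius unit sphere.

    Assumption: each residual is normalised to unit Frobenius norm, the
    weighted mean is projected back onto the sphere, and the result is
    rescaled by the weighted mean norm. The paper's actual reliability
    weighting is not given in the abstract.
    """
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    norms = [np.linalg.norm(r) for r in residuals]
    units = [r / n for r, n in zip(residuals, norms)]
    mean = sum(w * u for w, u in zip(weights, units))
    direction = mean / np.linalg.norm(mean)
    scale = sum(w * n for w, n in zip(weights, norms)) / sum(weights)
    return scale * direction


# Toy data: three hypothetical "experts" that perturb a shared base matrix.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 6))
experts = [base + 0.1 * rng.normal(size=(8, 6)) for _ in range(3)]

# Task vector = expert weights minus base weights.
task_vectors = [w - base for w in experts]

parts = [split_head_residual(tv, k=2) for tv in task_vectors]
heads = [h for h, _ in parts]
residuals = [r for _, r in parts]

# Uniform weights stand in for the reliability weights (hypothetical).
backbone = spherical_consensus(residuals, weights=[1.0, 1.0, 1.0])
merged_without_head = base + backbone
```

In this sketch the head and the residual of each task vector are orthogonal under the Frobenius inner product, which is what lets the two parts be analysed and merged separately. A head-correction step gated by cross-expert agreement would be applied on top of `backbone`; it is omitted here because the abstract does not describe its form.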