CORE-MTL: Rethinking Gradient Balancing via Causal Orthogonal Representations

2026-06-01 • Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Machine Learning

AI summary

The authors propose CORE-MTL, a method for multi-task learning, in which a single model handles several tasks at once. Rather than only balancing task gradients or modifying the shared architecture, the approach factorizes the shared representation into two streams: a semantic stream that carries task-relevant structure and a residual stream that absorbs nuisance variation. This is intended to reduce negative transfer between tasks. The authors report better performance on visual multi-task benchmarks in both in-distribution and out-of-distribution settings, without explicit gradient projection or reweighting.

Multi-task learning, Shared representation, Negative transfer, Causal representation, Out-of-distribution generalization, Task gradient, Semantic-residual factorization, Visual domain, Physical priors, Gradient interference

Authors

Chengfeng Wu, Tao Zou, Yanru Wu, Jingge Wang

Abstract

Multi-task learning (MTL) aims to construct a joint model for multiple tasks by sharing a common representation across domains. To achieve this goal, existing optimization-centric methods either balance task gradients or modify the shared architecture. However, as these approaches remain agnostic to the content of the shared representation, they fail to disentangle task-relevant structure from spurious context, leading to negative transfer and poor generalization. To overcome this limitation, we propose Causal Orthogonal Representations for Multi-Task Learning (CORE-MTL), a causally motivated representation-centric framework that encourages a structured semantic-residual factorization of the shared representation, concentrating task-relevant structure in the semantic stream while relegating nuisance variation to the residual stream. We instantiate this framework in the visual domain by leveraging physical priors for structured scenes and statistical constraints for attributes. Theoretically, our method enjoys a tighter out-of-distribution generalization bound than optimization-centric methods and reduces task gradient interference without explicit gradient projection or reweighting. Empirically, CORE-MTL consistently outperforms existing methods on visual multi-task benchmarks in both in-distribution and out-of-distribution settings. Code is publicly available at https://github.com/Hope-Rita/CORE-MTL.
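The abstract does not give the loss terms, so the following is only an illustrative sketch of the two quantities it refers to, not the authors' implementation. It assumes gradient interference is measured as the cosine similarity between flattened task gradients (negative values indicate conflict), and it uses a generic decorrelation penalty, the squared Frobenius norm of the batch cross-covariance between the semantic and residual features, as one possible way to encourage the two streams to carry separate information. The function names `gradient_cosine` and `cross_covariance_penalty` are hypothetical.

```python
import math


def gradient_cosine(g1, g2):
    """Cosine similarity between two flattened task-gradient vectors.

    A negative value means the two tasks pull the shared parameters in
    conflicting directions (gradient interference).
    """
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    return dot / (norm1 * norm2)


def cross_covariance_penalty(semantic, residual):
    """Squared Frobenius norm of the batch cross-covariance matrix.

    semantic: list of feature rows, shape (batch, d_s)
    residual: list of feature rows, shape (batch, d_r)
    The penalty is zero when every semantic feature is uncorrelated with
    every residual feature over the batch. This is an assumed, generic
    regularizer, not the loss defined in the paper.
    """
    n = len(semantic)
    d_s, d_r = len(semantic[0]), len(residual[0])
    mean_s = [sum(row[i] for row in semantic) / n for i in range(d_s)]
    mean_r = [sum(row[j] for row in residual) / n for j in range(d_r)]
    penalty = 0.0
    for i in range(d_s):
        for j in range(d_r):
            cov = sum(
                (semantic[k][i] - mean_s[i]) * (residual[k][j] - mean_r[j])
                for k in range(n)
            ) / n
            penalty += cov * cov
    return penalty
```

In a training loop, a penalty of this kind would typically be added to the sum of task losses with a small weight, while the gradient cosine would serve as a diagnostic rather than as part of the objective.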
