Plug-and-Play Logit Fusion for Heterogeneous Pathology Foundation Models

2026-04-09 · Computer Vision and Pattern Recognition

AI summary

The authors address the model-selection problem for pathology foundation models (FMs): no single model is best on every task, and adapting and validating many candidates for each task is costly. They propose LogitProd, a lightweight method that treats independently trained FM-based predictors as fixed experts and learns sample-adaptive weights over their output logits, with no encoder retraining and no feature-space alignment. Their theoretical analysis shows that the optimal weighted product fusion performs at least as well as the best single expert under the training objective. Across 22 benchmarks, LogitProd ranks first on 20 and improves average performance by about 3% over the strongest single expert, at roughly 12× lower training cost than feature-fusion alternatives.

pathology foundation models, computational histopathology, model fusion, logits, transfer learning, whole slide images, gene mutation prediction, survival modeling, ensemble methods, feature fusion
Authors
Gexin Huang, Anqi Li, Yusheng Tan, Beidi Zhao, Gang Wang, Gaozu Hua, Xiaoxiao Li
Abstract
Pathology foundation models (FMs) have become central to computational histopathology, offering strong transfer performance across a wide range of diagnostic and prognostic tasks. The rapid proliferation of pathology foundation models creates a model-selection bottleneck: no single model is uniformly best, yet exhaustively adapting and validating many candidates for each downstream endpoint is prohibitively expensive. We address this challenge with a lightweight and novel model fusion strategy, LogitProd, which treats independently trained FM-based predictors as fixed experts and learns sample-adaptive fusion weights over their slide-level outputs. The fusion operates purely on logits, requiring no encoder retraining and no feature-space alignment across heterogeneous backbones. We further provide a theoretical analysis showing that the optimal weighted product fusion is guaranteed to perform at least as well as the best individual expert under the training objective. We systematically evaluate LogitProd on 22 benchmarks spanning WSI-level classification, tile-level classification, gene mutation prediction, and discrete-time survival modeling. LogitProd ranks first on 20/22 tasks and improves the average performance across all tasks by ~3% over the strongest single expert. LogitProd enables practitioners to upgrade heterogeneous FM-based pipelines in a plug-and-play manner, achieving multi-expert gains with ~12× lower training cost than feature-fusion alternatives.
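
The weighted product fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the fused prediction is a normalized product of the experts' class probabilities, each raised to a per-sample weight, which in log space is a weighted sum of expert log-probabilities. The function names, array shapes, and random toy data are hypothetical, and the weights are passed in directly because the abstract does not specify how LogitProd produces them. The sketch also shows the intuition behind the stated guarantee: one-hot weights reproduce any single expert exactly, so the best weights over a family that includes them cannot do worse than the best expert on the training loss.

```python
import numpy as np

def log_softmax(z, axis=-1):
    # Numerically stable log-softmax.
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def weighted_product_fusion(expert_logits, weights):
    """Fuse K experts by a weighted product of their class probabilities.

    expert_logits: array of shape (K, N, C), logits from K fixed experts.
    weights:       array of shape (N, K), per-sample fusion weights
                   (assumed given here; how they are learned is not shown).
    Returns fused log-probabilities of shape (N, C).
    """
    log_probs = log_softmax(expert_logits, axis=-1)        # (K, N, C)
    # Weighted sum of log-probabilities equals the log of a weighted product.
    fused = np.einsum("nk,knc->nc", weights, log_probs)    # (N, C)
    return log_softmax(fused, axis=-1)                     # renormalize

def nll(log_probs, labels):
    # Mean negative log-likelihood of the true labels.
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy data (hypothetical): 3 experts, 5 samples, 4 classes.
rng = np.random.default_rng(0)
K, N, C = 3, 5, 4
expert_logits = rng.normal(size=(K, N, C))
labels = rng.integers(0, C, size=N)

# Uniform weights give a plain geometric-mean ensemble.
uniform = np.full((N, K), 1.0 / K)
fused = weighted_product_fusion(expert_logits, uniform)

# One-hot weights on the best expert recover that expert exactly.
expert_nll = [nll(log_softmax(expert_logits[k]), labels) for k in range(K)]
best = int(np.argmin(expert_nll))
one_hot = np.zeros((N, K))
one_hot[:, best] = 1.0
recovered = weighted_product_fusion(expert_logits, one_hot)
```

Because the experts stay frozen, only the small weight-producing component would need training in such a setup, which is consistent with the low training cost reported in the abstract.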