Unified Multimodal Model for Brain MRI Imputation and Understanding
2026-06-15 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition • Artificial Intelligence • Multimedia
AI summary
The authors developed UniBrain, a new AI model that analyzes brain MRI scans to help with medical diagnosis. It can work even when some types of MRI images are missing, because it learns to fill in the missing images and to interpret the brain images in a single training process. A self-alignment strategy helps the model learn fine-grained anatomical features without needing detailed image captions. Tests showed UniBrain performs well on brain image completion and disease diagnosis, even with incomplete data.
Multimodal Large Language Models • Brain MRI • Modality Imputation • Medical Imaging Analysis • Autoregressive Training • Self-Alignment • Dense Image Embeddings • Exposure Bias • Multimodal Inference • Disease Diagnosis
Authors
Zhiyun Song, Che Liu, Tian Xia, Avinash Kori, Wenjia Bai
Abstract
Multimodal large language models (MLLMs) hold great potential for medicine, as they inherit knowledge from LLMs and allow multiple data modalities to be integrated, analysed and interpreted in natural language. However, the field of medical MLLMs is constrained by non-trivial challenges, notably the scarcity of high-quality training data and the frequent occurrence of missing data in real-world clinical settings. Here, we propose a novel unified multimodal model, UniBrain, for brain magnetic resonance imaging (MRI) analysis. To address potentially missing brain MRI modalities, we employ a unified training strategy that jointly performs imaging modality imputation and brain image understanding. During training, an interleaved and description-enriched data flow is constructed to train the model in an autoregressive manner, enabling medical reasoning with generated multimodal data. A self-alignment strategy is introduced that leverages dense image embeddings to learn fine-grained anatomical features without requiring detailed image captions. Furthermore, we propose a dynamic hidden state mechanism to alleviate exposure bias during long-context multimodal inference. Extensive experiments on a multi-disease brain MRI dataset demonstrate that UniBrain achieves high performance for brain image imputation, understanding, and disease diagnosis under various extents of modality incompleteness.