Generative Drifting for Conditional Medical Image Generation
2026-04-21 • Computer Vision and Pattern Recognition
AI summary
The authors present GDM, a method that generates 3D medical images, such as CT volumes synthesized from MRI or reconstructed from sparse-view CT, in a single inference step. The approach aims to keep outputs plausible at the level of the overall image distribution while staying faithful to each patient's anatomy. It does this with generative drifting, supported by a multi-level feature bank built from a medical foundation encoder and by a gradient coordination strategy that balances the competing objectives. On MRI-to-CT synthesis and sparse-view CT reconstruction, GDM is reported to outperform GAN-based, flow-matching-based, SDE-based, and supervised regression baselines. This suggests GDM could be a practical tool for conditional 3D medical image generation.
Conditional medical image generation, 3D volumetric imaging, Generative drifting, Patient-specific fidelity, Distribution-level plausibility, MRI-to-CT synthesis, Sparse-view CT reconstruction, Multi-objective learning, Medical foundation encoder, Gradient coordination
Authors
Zirong Li, Siyuan Mei, Weiwen Wu, Andreas Maier, Lina Gölz, Yan Xia
Abstract
Conditional medical image generation plays an important role in many clinically relevant imaging tasks. However, existing methods still face a fundamental challenge in balancing inference efficiency, patient-specific fidelity, and distribution-level plausibility, particularly in high-dimensional 3D medical imaging. In this work, we propose GDM, a generative drifting framework that reformulates deterministic medical image prediction as a multi-objective learning problem to jointly promote distribution-level plausibility and patient-specific fidelity while retaining one-step inference. GDM extends drifting to 3D medical imaging through an attractive-repulsive drift that minimizes the discrepancy between the generator pushforward and the target distribution. To enable stable drifting-based learning in 3D volumetric data, GDM constructs a multi-level feature bank from a medical foundation encoder to support reliable affinity estimation and drifting field computation across complementary global, local, and spatial representations. In addition, a gradient coordination strategy in the shared output space improves optimization balance under competing distribution-level and fidelity-oriented objectives. We evaluate the proposed framework on two representative tasks, MRI-to-CT synthesis and sparse-view CT reconstruction. Experimental results show that GDM consistently outperforms a wide range of baselines, including GAN-based, flow-matching-based, and SDE-based generative models, as well as supervised regression methods, while improving the balance among anatomical fidelity, quantitative reliability, perceptual realism, and inference efficiency. These findings suggest that GDM provides a practical and effective framework for conditional 3D medical image generation.
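To make the attractive-repulsive drift concrete, the following is a minimal sketch on toy feature vectors. It is not the GDM implementation. The kernel (a softmax over negative squared distances), the `temperature` parameter, and the function names `softmax_affinity` and `drift_field` are all assumptions chosen for illustration; the abstract does not specify them. The sketch only shows the general idea: each generated sample is pulled toward an affinity-weighted mean of target samples and pushed away from an affinity-weighted mean of generated samples, so the field vanishes when the two sample sets coincide.

```python
import numpy as np


def softmax_affinity(x, y, temperature):
    """Row-normalized affinities between rows of x (n, d) and y (m, d).

    Assumption: affinity is a softmax over negative squared Euclidean
    distance; the paper may use a different kernel or feature space.
    """
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def drift_field(gen, target, temperature=1.0):
    """Illustrative attractive-repulsive drift for generated features.

    gen:    (n, d) features of generated samples
    target: (m, d) features of target samples
    Returns an (n, d) array of drift vectors.
    """
    w_pos = softmax_affinity(gen, target, temperature)  # (n, m)
    w_neg = softmax_affinity(gen, gen, temperature)     # (n, n)
    attract = w_pos @ target - gen  # pull toward nearby target samples
    repel = w_neg @ gen - gen       # direction toward nearby generated samples
    return attract - repel          # move toward targets, away from own cluster


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gen = rng.normal(0.0, 0.1, size=(8, 2))
    target = rng.normal(3.0, 0.1, size=(8, 2))
    v = drift_field(gen, target)
    print(v.shape)
```

In a training loop, a field like this would typically be evaluated in encoder feature space and used as a regression target for the generator output; that wiring is omitted here because the abstract does not describe it.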