A Biconvex Formulation for Stable Transport of Mixture Models with a Unique Solution

2026-06-01, Machine Learning

AI summary

The authors propose Optimal Mixture Transport (OMT), a method that simplifies the problem of mapping data between probability distributions. Instead of transporting individual data points, it transports mixtures of subpopulations, so the computational cost depends on the number of mixture components rather than the number of samples, and the resulting plan is easier to interpret. They prove that the method is stable: bounded perturbations of the input distributions produce bounded changes in the transport plan. OMT is demonstrated on synthetic benchmarks and on real-world datasets, including images and large-scale single-cell RNA sequencing data.

Optimal Transport, Probability Distributions, Biconvex Optimization, Mixture Models, Exponential-family Distributions, Transport Plan, Stability Analysis, Single-cell RNA Sequencing, Scalability, Data Mapping
Authors
Yeganeh Marghi, Kelly Jin, Uygar Sümbül
Abstract
Optimal transport (OT) provides a principled framework for mapping between probability distributions. Despite extensive progress, applying OT to large-scale data remains computationally demanding, and the resulting pointwise transport plans are often difficult to interpret. We introduce Optimal Mixture Transport (OMT), a scalable framework that shifts the transport paradigm from individual samples to mixtures of subpopulations, reformulating the transport problem as a strictly biconvex optimization with a unique global minimizer. We further establish theoretical guarantees on the stability of the OMT map, showing that bounded perturbations of the underlying distributions lead to bounded changes in the transport plan. By formulating subpopulations as exponential-family distributions, OMT decouples computational complexity from the sample size, scaling solely with the number of mixture components. We demonstrate the effectiveness and practicality of OMT on a wide range of synthetic benchmarks and real-world datasets, including image data and large-scale single-cell RNA sequencing measurements.
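The abstract does not state the OMT objective, so the sketch below is not the authors' algorithm. It only illustrates the general idea of transporting mixture components instead of samples, using a swapped-in standard technique: entropic (Sinkhorn) transport between component weights, with the closed-form squared 2-Wasserstein distance between isotropic Gaussians as the cost. The function names, the isotropic-Gaussian assumption, the regularization value, and all numbers are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_w2_cost(means_a, stds_a, means_b, stds_b):
    """Pairwise squared 2-Wasserstein distance between isotropic Gaussians.

    For N(m1, s1^2 I) and N(m2, s2^2 I) in d dimensions the closed form is
    ||m1 - m2||^2 + d * (s1 - s2)^2.
    """
    d = means_a.shape[1]
    mean_term = ((means_a[:, None, :] - means_b[None, :, :]) ** 2).sum(axis=-1)
    std_term = d * (stds_a[:, None] - stds_b[None, :]) ** 2
    return mean_term + std_term


def sinkhorn_plan(weights_a, weights_b, cost, reg=0.5, n_iters=500):
    """Entropic OT plan between two discrete weight vectors (Sinkhorn iterations)."""
    kernel = np.exp(-cost / reg)
    u = np.ones_like(weights_a)
    v = np.ones_like(weights_b)
    for _ in range(n_iters):
        u = weights_a / (kernel @ v)    # rescale rows toward the source weights
        v = weights_b / (kernel.T @ u)  # rescale columns toward the target weights
    return u[:, None] * kernel * v[None, :]


# Hypothetical source mixture: 3 components in 2-D.
weights_a = np.array([0.5, 0.3, 0.2])
means_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
stds_a = np.array([0.5, 0.3, 0.4])

# Hypothetical target mixture: 2 components in 2-D.
weights_b = np.array([0.6, 0.4])
means_b = np.array([[0.1, 0.0], [1.0, 1.0]])
stds_b = np.array([0.5, 0.2])

cost = gaussian_w2_cost(means_a, stds_a, means_b, stds_b)
plan = sinkhorn_plan(weights_a, weights_b, cost)

# plan[i, j] is the mass moved from source component i to target component j.
print(plan)
```

The plan here is a 3 x 2 matrix regardless of how many samples were used to fit each mixture, which is the sense in which component-level transport decouples cost from sample size. A fitted model would supply the weights, means, and scales in place of the hard-coded arrays.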