Convex Distance Operator Transport: A Convex and Geometry-Preserving Formulation

2026-06-01

Machine Learning
AI summary

The authors present Convex Distance Operator Transport (CDOT), a method for matching data distributions from different sources while preserving both feature correspondence and the geometric relationships within each dataset. The approach regularizes the matching with distance and conditional expectation operators, which makes it more robust to local changes in the shape of the data. They prove that CDOT defines a valid pseudometric between datasets and use a notion called the dispersion gap to explain why the earlier Gromov-Wasserstein method is non-convex while CDOT is convex. They also derive a finite-sample risk bound and report experiments in which CDOT outperforms existing methods on synthetic point clouds, brain connectomes, and graph classification.

Optimal Transport, Convex Optimization, Metric-Measure Spaces, Gromov-Wasserstein Distance, Regularization, Frank-Wolfe Algorithm, Graph Matching, Geometric Data Analysis, Risk Bound, Conditional Expectation Operator
Authors
Junhyoung Chung, Euijong Song, Won Hwa Kim, Gunwoong Park
Abstract
We introduce Convex Distance Operator Transport (CDOT), the first convex optimal transport framework that aligns distributions across heterogeneous domains by jointly preserving feature correspondence and intrinsic geometric structure. Specifically, CDOT employs an operator-based regularization, built from distance and conditional expectation operators, that aligns aggregated distance structures. Consequently, the proposed regularization improves robustness to local geometric variations. We further prove that the resulting CDOT discrepancy is a valid pseudometric on the space of attributed compact metric-measure spaces. In addition, we characterize the relationship between CDOT and Gromov-Wasserstein (GW) through a new notion of dispersion gap, formally elucidating the geometric source of non-convexity in GW in contrast to the convexity of CDOT. In the finite-sample regime, we derive a non-asymptotic risk bound decomposed into optimization and statistical errors, establishing risk consistency under a globally convergent Frank-Wolfe algorithm. Experiments on synthetic point clouds, brain connectomes, and graph classification benchmarks demonstrate improved performance over existing methods, with stable and reliable behavior in practice.
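
The non-convexity of GW mentioned in the abstract can be seen on a very small example. The sketch below is not taken from the paper: it uses the standard discrete squared-loss GW objective, and the helper name `gw_objective` and the two-point toy spaces are assumptions made for illustration. The CDOT objective is not reproduced here because the abstract does not state it. The identity and swap couplings both reach the optimal value 0, yet their midpoint (the product coupling) has a strictly larger value, which a convex objective would not allow.

```python
from itertools import product

def gw_objective(C1, C2, P):
    """Standard discrete squared-loss GW objective (illustrative helper):
    sum over i, j, k, l of (C1[i][k] - C2[j][l])**2 * P[i][j] * P[k][l]."""
    n, m = len(C1), len(C2)
    total = 0.0
    for i, j, k, l in product(range(n), range(m), range(n), range(m)):
        total += (C1[i][k] - C2[j][l]) ** 2 * P[i][j] * P[k][l]
    return total

# Two identical two-point spaces with unit distance and uniform weights (toy data).
C = [[0.0, 1.0], [1.0, 0.0]]

# Two couplings with uniform marginals (1/2, 1/2): identity and swap matchings.
P_id = [[0.5, 0.0], [0.0, 0.5]]
P_swap = [[0.0, 0.5], [0.5, 0.0]]
# Their midpoint is the product coupling, which is also feasible.
P_mid = [[0.25, 0.25], [0.25, 0.25]]

f_id = gw_objective(C, C, P_id)
f_swap = gw_objective(C, C, P_swap)
f_mid = gw_objective(C, C, P_mid)

# Convexity would require f_mid <= (f_id + f_swap) / 2; here it fails.
print(f_id, f_swap, f_mid)
```

Because both endpoints are isometric matchings, the feasible set contains two distinct global minimizers separated by a region of higher cost, which is the kind of landscape that makes local solvers for GW depend on initialization.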