The Observable Wasserstein Distance

2026-05-11Machine Learning

Machine Learning
AI summary

The authors introduce the observable Wasserstein distance, a new way to estimate how different two probability distributions are, especially when dealing with complex or large datasets where exact calculations are too slow. Their method projects data onto simpler one-dimensional summaries and measures differences there, creating a step-by-step system that balances accuracy and computation time. They prove that this system can uniquely identify distributions under certain mathematical conditions, similar to a classical result for Euclidean spaces. Finally, they test their approach on grid-based data and show it works well in practice.

Wasserstein distanceoptimal transportPolish metric spaces1-Lipschitz functionspushforward distributionsliced Wasserstein distancemetric covering dimensionCramér-Wold Devicepseudo-metriccomputational efficiency
Authors
Edivaldo Lopes dos Santos, Leandro Vicente Mauri, Washington Mio, Tom Needham
Abstract
We introduce the observable Wasserstein distance, a framework for deriving lower bounds on the Wasserstein distance between probability measures on Polish metric spaces, designed to bypass the computational intractability of exact optimal transport in large-scale, non-Euclidean datasets. Analogous to the sliced Wasserstein distance in $\mathbb{R}^d$, our approach projects measures onto the real line via 1-Lipschitz observables and computes the Wasserstein distances between the resulting pushforward distributions. We define a hierarchy of pseudo-metrics by restricting observables to a nested chain of subspaces. A central theoretical contribution is an injectivity result linking the metric covering dimension of the support of a measure to the specific order in the hierarchy that guarantees unique recovery. This serves as a metric-space analogue to the Cramér-Wold Device for Euclidean distributions. We demonstrate that this hierarchy offers a tunable trade-off between sharpness as a lower bound on the Wasserstein distance and computational efficiency. We also present a discrete computational model for finite grids and numerical experiments validating the efficacy and utility of these approximations.