Network Learning with Semi-relaxed Gromov-Wasserstein

2026-06-01Machine Learning

Machine Learning
AI summary

The authors tackle the tough problem of figuring out hidden connections in big networks, which is usually very hard because nodes don't have fixed labels. They simplify this by allowing uncertain matches between parts of the networks, using a math tool called the semi-relaxed Gromov-Wasserstein approach. Their method finds a simpler representation of the network's structure and solves it efficiently with a special algorithm. They prove that their relaxed solution is almost as good as a perfect one when the network is large and show it works well on various network models and real data.

Generative network modelsLatent connectivityNP-hard problemsProbabilistic couplingGromov-Wasserstein distanceSemi-relaxed optimizationBlock-coordinate conditional gradientStochastic block modelsGraphonsConsistency and convergence rates
Authors
Charles Dufour, Ulysse Naepels, Leonardo V. Santoro
Abstract
Estimating the generative mechanism of large-scale networks is a fundamental challenge in statistical machine learning. It requires the identification of the latent connectivity structure, which is in general an NP-hard combinatorial problem due to the absence of canonical node labels. We address this challenge by allowing for probabilistic couplings, thereby relaxing the assignment problem. Our estimation framework can be formulated as a semi-relaxed Gromov-Wasserstein objective and provides a low-dimensional representation of the generative structure. We solve this via a block-coordinate conditional gradient algorithm. Despite the relaxation, the resulting solution is typically deterministic: in fact, we show that the optimality gap between the relaxed solution and the deterministic assignment vanishes at rate $O(1/n)$, where $n$ is the number of nodes. This allows for tractable recovery of the underlying model and enables rigorous statistical analysis: we establish consistency and minimax-optimal convergence rates for both stochastic block models and Holder-smooth graphons. Our implementation scales efficiently with $n$, as demonstrated on both synthetic and real-world datasets.