Low-rank Distributional Matrix Completion

2026-06-02Machine Learning

Machine Learning
AI summary

The authors study a version of matrix completion where each entry in the matrix is a probability distribution, not just a single number. They only see some entries and get limited samples from those distributions. To handle this, they use a technique called kernel mean embeddings and define a new form of low-rank structure for these distribution matrices. They develop a new method to complete the matrix and prove how well their method works theoretically. Experiments on simulated and real data show their approach is effective.

matrix completionprobability distributionskernel mean embeddingslow-rank structureTucker rankfunctional unfolding operatorsnon-asymptotic error boundsstatistical estimationtensor decompositionsampling
Authors
Jiayi Wang, Raymond K. W. Wong
Abstract
We study a distributional generalization of the matrix completion problem in which each entry of the target matrix is a probability distribution rather than a scalar. In this setting, only a subset of matrix entries is observed, and even for observed entries, the underlying distributions are not directly accessible; instead, we observe finitely many samples drawn from them. To represent distributional entries, we employ kernel mean embeddings and introduce a notion of Tucker rank for distribution-valued matrices to capture their low-rank structure. The infinite-dimensional nature of kernel embeddings poses significant methodological challenges. To address this, we introduce functional unfolding operators that link the proposed distributional low-rank structure to the classical Tucker rank for finite-dimensional tensors. Based on this framework, we propose a novel estimator for distributional matrix completion. We establish non-asymptotic error bounds that characterize the statistical performance of the estimator. Extensive experiments on synthetic data and a real-world application demonstrate the effectiveness of the proposed method.