Variance Reduction for Non-Log-Concave Sampling with Applications to Inverse Problems
2026-06-15 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors study how to better sample from complicated probability distributions when the exact calculations needed are expensive and noisy. They analyze variance reduction methods like SGD with momentum, STORM, and PAGE, which are known to help in optimization but have not been well understood for sampling tasks. Their work shows these methods improve convergence rates and reliability when approximating the target distribution, especially in imaging problems. They also provide theoretical guarantees and confirm their results with experiments showing better sample quality under fixed computation constraints.
non-log-concave distributions, stochastic gradients, variance reduction, SGD with momentum, STORM, PAGE, Fisher information, Poincaré inequality, score-based generative priors, inverse problems
Authors
M. Berk Sahin, Ahmet Ege Tanriverdi, Behzad Sharif, Abolfazl Hashemi
Abstract
Sampling from high-dimensional, non-log-concave distributions with unnormalized densities is a fundamental challenge in machine learning, particularly when the exact gradient of the potential is unavailable and must be approximated via stochastic gradients that exhibit high variance under a fixed budget of gradient computations per iteration. Although variance reduction techniques such as SGD with momentum, STORM, and PAGE have demonstrated improved convergence properties in non-convex optimization, their implications for sampling from non-log-concave distributions remain largely unexplored. In this work, we develop the first unified analysis of these estimators for sampling from non-log-concave distributions. We establish improved non-asymptotic convergence rates in $\varepsilon$-relative Fisher information and, under a Poincaré inequality assumption, in squared total variation distance, and further prove weak convergence to the target distribution. We extend our analysis to solving inverse problems with score-based generative priors. We empirically validate our theory and demonstrate that, under a fixed budget of gradient computations per iteration, variance reduction techniques consistently improve sample quality in two standard imaging applications.
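To make the setting concrete, the following is a minimal sketch of a Langevin-type sampler whose drift uses a momentum-averaged stochastic gradient instead of a raw one. It is not the authors' algorithm, which the abstract does not spell out. The double-well potential, the additive Gaussian noise oracle, the function names, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def grad_potential(x):
    # Gradient of the toy double-well potential U(x) = (x^2 - 1)^2 / 4
    # (an assumed, non-log-concave example target), applied per chain.
    return x * (x**2 - 1.0)

def noisy_grad(x, rng, noise_std):
    # Hypothetical stochastic gradient oracle: exact gradient plus Gaussian noise.
    return grad_potential(x) + noise_std * rng.standard_normal(x.shape)

def langevin_with_momentum(x0, n_steps, step, beta, noise_std, seed=0):
    """Run independent Langevin chains with a momentum gradient estimator.

    beta = 1.0 recovers plain stochastic gradient Langevin dynamics;
    beta < 1.0 averages past stochastic gradients (SGD-with-momentum style).
    Returns the final states and the mean squared error of the estimator
    with respect to the exact gradient, averaged over steps and chains.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    d = noisy_grad(x, rng, noise_std)  # initialize the estimator
    total_sq_err = 0.0
    for _ in range(n_steps):
        # Langevin update driven by the current gradient estimate.
        x = x - step * d + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        # Momentum update: exponential moving average of stochastic gradients.
        d = (1.0 - beta) * d + beta * noisy_grad(x, rng, noise_std)
        total_sq_err += np.mean((d - grad_potential(x)) ** 2)
    return x, total_sq_err / n_steps

if __name__ == "__main__":
    x0 = np.zeros(500)  # 500 independent one-dimensional chains
    _, mse_plain = langevin_with_momentum(x0, 2000, 1e-3, beta=1.0, noise_std=2.0)
    _, mse_mom = langevin_with_momentum(x0, 2000, 1e-3, beta=0.1, noise_std=2.0)
    print("gradient estimator MSE, plain:", mse_plain)
    print("gradient estimator MSE, momentum:", mse_mom)
```

In this toy setup the momentum estimator trades a small bias from stale gradients for a large drop in variance, at one stochastic gradient call per iteration in both cases. STORM and PAGE use different correction terms and are not shown here.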