New Bounds for the Last Iterate of the Stochastic subGradient Method

2026-06-23Machine Learning

Machine Learning
AI summary

The authors study a simple algorithm called the stochastic subgradient method, used to minimize certain one-dimensional functions. They focus on the performance when stopping at the last point after a fixed number of steps. They prove that if the noise in the method is independent and identically distributed with bounded variance, the error in the final answer is proportional to 1 divided by the square root of the number of steps, removing an extra logarithmic term found in previous results. However, if the noise is not independent, the error can be worse by a logarithmic factor. This means that just assuming bounded variance is not enough to guarantee the best performance, answering a question from earlier work.

stochastic subgradient methodconvex optimizationLipschitz functionoptimization errorfixed stepsizei.i.d. noisebounded variancelast iterateone-dimensional optimization
Authors
Guglielmo Beretta, Tommaso Cesari, Roberto Colomboni, Andrea Paudice
Abstract
We study the last iterate of the stochastic subgradient method for one-dimensional convex Lipschitz objectives. For a fixed horizon $n$, we consider the standard fixed stepsizes $η=Θ(1/\sqrt n)$. We prove that, for such stepsize policies, under additive i.i.d. subgradient noise with uniformly bounded variance, the last iterate features an optimization error of order $1/\sqrt n$, thereby removing the extra $(\log n)$ factor present in existing generic bounds. On the other hand, we show that without the i.i.d. assumption, the optimization error can be of order $(\log n)/\sqrt n$. Thus, under the uniformly bounded variance assumption alone, the last iterate of SsGM is suboptimal even in dimension one, resolving negatively an open problem posed in Koren and Segal, COLT, 2020.