Adam-HNAG: A Convergent Reformulation of Adam with Accelerated Rate

2026-04-09
Machine Learning

AI summary

The authors study Adam, a popular optimization method in machine learning whose theory remains incomplete even in the deterministic full-batch setting, where every step uses the gradient of the entire dataset. They derive a reformulation called Adam-HNAG with guaranteed convergence on smooth convex problems by splitting apart the coupled momentum and adaptive-preconditioning components and adding a curvature-aware gradient correction. Their results cover a continuous-time flow and two discrete methods, Adam-HNAG and the synchronous variant Adam-HNAG-s, with Lyapunov-based proofs of convergence, including accelerated rates. Numerical experiments support the theory and show how the two discretizations behave differently in practice. To the authors' knowledge, this is the first convergence proof for Adam-type methods in convex optimization.

Adam optimizer, adaptive preconditioning, momentum, convex optimization, Lyapunov function, gradient correction, operator splitting, convergence analysis, smooth optimization, continuous-time flow
Authors
Yaxin Yu, Long Chen, Zeyi Xu
Abstract
Adam has achieved strong empirical success, but its theory remains incomplete even in the deterministic full-batch setting, largely because adaptive preconditioning and momentum are tightly coupled. In this work, a convergent reformulation of full-batch Adam is developed by combining variable and operator splitting with a curvature-aware gradient correction. This leads to a continuous-time Adam-HNAG flow with an exponentially decaying Lyapunov function, as well as two discrete methods: Adam-HNAG, and Adam-HNAG-s, a synchronous variant closer in form to Adam. Within a unified Lyapunov analysis framework, convergence guarantees are established for both methods in the convex smooth setting, including accelerated convergence. Numerical experiments support the theory and illustrate the different empirical behavior of the two discretizations. To the best of our knowledge, this provides the first convergence proof for Adam-type methods in convex optimization.
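For readers unfamiliar with the coupling the abstract refers to, the following is a minimal sketch of the standard full-batch Adam update as it is commonly stated, not of Adam-HNAG or Adam-HNAG-s, whose update rules are not given in this listing. The quadratic test problem, the hyperparameter values, and the names `full_batch_adam`, `A`, `f`, `grad`, `x0`, and `x_final` are illustrative assumptions.

```python
import numpy as np


def full_batch_adam(grad, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Standard Adam with exact (full-batch) gradients; illustrative only."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)  # first moment: momentum-like average of gradients
    v = np.zeros_like(x)  # second moment: drives the adaptive preconditioner
    for k in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** k)  # bias correction
        v_hat = v / (1.0 - beta2 ** k)
        # The momentum term is rescaled by the preconditioner in a single
        # update, which is the coupling the abstract describes.
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x


# Hypothetical smooth convex test problem: f(x) = 0.5 * x^T A x, minimizer at 0.
A = np.diag([1.0, 10.0])


def f(x):
    return 0.5 * x @ A @ x


def grad(x):
    return A @ x


x0 = np.array([3.0, -2.0])
x_final = full_batch_adam(grad, x0)
print("f(x0) =", f(x0), " f(x_final) =", f(x_final))
```

In this sketch the state `m` and the scaling `1 / sqrt(v_hat)` both depend on the same gradient history and enter the iterate through one product, which is the kind of coupling the paper says it separates by variable and operator splitting.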