Cheap Thrills: Effective Amortized Optimization Using Inexpensive Labels
2026-03-05 • Machine Learning
AI summary
The authors study how to speed up the solution of complex optimization and simulation problems using machine-learning models that predict solutions. They note that existing methods either require expensive, highly accurate labels or are difficult to train. To address this, the authors propose a three-stage method: first gather cheap but imperfect labels, then pretrain the model on this data with supervised learning, and finally refine it with self-supervised learning to boost accuracy. Their theory and experiments on several hard problem classes show that this approach achieves faster convergence and better solutions at a much lower training cost.
optimization, simulation, machine learning surrogate, supervised learning, self-supervised learning, feasibility, nonconvex optimization, power-grid operation, dynamical systems, training cost
Authors
Khai Nguyen, Petros Ellinas, Anvita Bhagavathula, Priya Donti
Abstract
To scale the solution of optimization and simulation problems, prior work has explored machine-learning surrogates that inexpensively map problem parameters to corresponding solutions. Commonly used approaches, including supervised and self-supervised learning with either soft or hard feasibility enforcement, face inherent challenges such as reliance on expensive, high-quality labels or difficult optimization landscapes. To address their trade-offs, we propose a novel framework that first collects "cheap" imperfect labels, then performs supervised pretraining, and finally refines the model through self-supervised learning to improve overall performance. Our theoretical analysis and merit-based criterion show that labeled data need only place the model within a basin of attraction, confirming that only modest numbers of inexact labels and training epochs are required. We empirically validate our simple three-stage strategy across challenging domains, including nonconvex constrained optimization, power-grid operation, and stiff dynamical systems, and show that it yields faster convergence; improved accuracy, feasibility, and optimality; and up to 59x reductions in total offline cost.
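The three-stage strategy described in the abstract can be illustrated with a toy sketch. The example below is an assumption-laden simplification, not the paper's implementation: the problem family, the linear surrogate, the noise model, and the learning rate are all invented for illustration. Stage 1 generates noisy "cheap" labels, stage 2 pretrains the surrogate on them via least squares, and stage 3 refines it self-supervised by descending the true objective directly, with no labels needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem family (hypothetical): for parameters p, solve
#   min_y 0.5 * ||y||^2 - p @ y,   whose exact solution is y*(p) = p.
d, n = 5, 200
P = rng.normal(size=(n, d))  # sampled problem parameters

# Stage 1: collect "cheap" imperfect labels (exact solution + noise).
cheap_labels = P + 0.3 * rng.normal(size=P.shape)

# Stage 2: supervised pretraining of a linear surrogate y_hat = p @ W.
W = np.linalg.lstsq(P, cheap_labels, rcond=None)[0]
err_pretrained = np.mean(np.abs(P @ W - P))

# Stage 3: self-supervised refinement, minimizing the true objective
# over the surrogate's outputs (gradient of the objective w.r.t. y is
# y - p; the chain rule pushes it back onto W).
lr = 0.1
for _ in range(200):
    Y = P @ W
    grad_W = P.T @ (Y - P) / n
    W -= lr * grad_W

err_refined = np.mean(np.abs(P @ W - P))
```

On this toy problem the pretrained surrogate lands near the optimum (inside the "basin of attraction" the theory refers to), and the label-free refinement stage then drives the remaining error down by orders of magnitude.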