Analytical Evaluation of DCA Convergence Properties for Minimizing Prediction Functions of Gaussian RBF Support Vector Regression

2026-06-02

Machine Learning
AI summary

The authors address nonconvex optimization problems in which the objective is the prediction function of a trained machine learning model called RBF-SVR. By analyzing the structure of the RBF kernel, they show how to split this objective into a difference of two convex parts so that a method called the DC algorithm (DCA) can be applied. They derive two quantities in closed form, μ and L, that bound how quickly the method converges and show how this depends on the model's settings. Their experiments indicate that a single combined value, C_αρ, controls the algorithm's behavior. This means the solver's convergence can be predicted from the model's parameters: approximately from the hyperparameters before training, and exactly after training.

Nonconvex optimization, Support Vector Regression (SVR), Radial Basis Function (RBF) kernel, Difference of Convex functions Algorithm (DCA), Strong Convexity, Lipschitz constant, Dual-coefficient, Hyperparameters, Convergence properties, DC decomposition
Authors
Yohei Kakimoto, Yuto Omae, Hirotaka Takahashi
Abstract
For nonconvex optimization problems whose objective is the prediction function of a trained Support Vector Regression (SVR) model with the Gaussian radial basis function (RBF) kernel (RBF-SVR), we present a framework that applies the difference of convex functions (DC) algorithm (DCA) by exploiting the analytical structure of the RBF kernel to construct an explicit DC decomposition. Specifically, we derive in closed form both the lower bound $μ$ of the strong convexity parameter of the DC components and the upper bound $L$ of the gradient Lipschitz constant of the subproblem. Both $μ$ and $L$ are determined solely by the post-training dual-coefficient sum $C_α$ and the RBF kernel parameter $γ$, together with the DC decomposition parameter $ρ$, and they share a common leading term $C_αρ$. Through numerical experiments on six benchmark functions, we show that $C_αρ$ is the primary single quantity characterizing both the convergence properties and the initial-point dependence of DCA, and further demonstrate that it decomposes into two independent pathways, $C \to C_α$ and $γ\to ρ$, with its primary variation governed by the SVR hyperparameters $(C, γ)$. Together, these results allow the convergence properties of DCA on RBF-SVR to be assessed in advance through the single scalar quantity $C_αρ$: approximately from $(C, γ)$ before training, and exactly in closed form after training.
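To make the setting concrete, the following is a minimal sketch of DCA applied to an RBF-SVR prediction function $f(x) = \sum_i α_i \exp(-γ\|x - s_i\|^2) + b$. It does not reproduce the authors' decomposition or their closed-form $μ$, $L$, and $ρ$, which the abstract does not state. Instead it assumes a generic proximal-type DC decomposition $f = g - h$ with $g(x) = \frac{τ}{2}\|x\|^2$ and $h = g - f$, where $τ = 2γ\sum_i |α_i|$ is chosen here because each Gaussian kernel term has a Hessian of spectral norm at most $2γ$, so $h$ is convex. Under that assumption the DCA subproblem has the closed-form solution $x_{k+1} = x_k - \nabla f(x_k)/τ$. All function names, the toy support vectors, and the dual coefficients below are hypothetical.

```python
import math


def rbf_svr_predict(x, support_vectors, dual_coefs, gamma, intercept=0.0):
    """Evaluate f(x) = sum_i a_i * exp(-gamma * ||x - s_i||^2) + b."""
    total = intercept
    for s, a in zip(support_vectors, dual_coefs):
        sq_dist = sum((xj - sj) ** 2 for xj, sj in zip(x, s))
        total += a * math.exp(-gamma * sq_dist)
    return total


def rbf_svr_gradient(x, support_vectors, dual_coefs, gamma):
    """Gradient of f: sum_i -2 * gamma * a_i * exp(-gamma * ||x - s_i||^2) * (x - s_i)."""
    grad = [0.0] * len(x)
    for s, a in zip(support_vectors, dual_coefs):
        sq_dist = sum((xj - sj) ** 2 for xj, sj in zip(x, s))
        weight = -2.0 * gamma * a * math.exp(-gamma * sq_dist)
        for j in range(len(x)):
            grad[j] += weight * (x[j] - s[j])
    return grad


def dca_minimize(x0, support_vectors, dual_coefs, gamma, max_iter=500, tol=1e-10):
    """DCA with the assumed decomposition g(x) = (tau/2)||x||^2, h = g - f.

    Each iteration linearizes h at x_k and minimizes g(x) - <grad h(x_k), x>,
    which gives x_{k+1} = x_k - grad f(x_k) / tau.
    """
    # Assumption: tau upper-bounds the Lipschitz constant of grad f,
    # which makes h convex. This is not the paper's parameter rho.
    tau = 2.0 * gamma * sum(abs(a) for a in dual_coefs)
    x = list(x0)
    for _ in range(max_iter):
        grad = rbf_svr_gradient(x, support_vectors, dual_coefs, gamma)
        x_new = [xj - gj / tau for xj, gj in zip(x, grad)]
        step = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_new, x)))
        x = x_new
        if step < tol:
            break
    return x


# Toy "trained model": two hypothetical support vectors in 2-D.
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
dual_coefs = [-1.0, 0.5]
gamma = 1.0

x_start = [0.3, 0.2]
x_star = dca_minimize(x_start, support_vectors, dual_coefs, gamma)
print("start value:", rbf_svr_predict(x_start, support_vectors, dual_coefs, gamma))
print("final point:", x_star)
print("final value:", rbf_svr_predict(x_star, support_vectors, dual_coefs, gamma))
```

For a fitted scikit-learn `SVR`, the inputs would plausibly come from `support_vectors_`, `dual_coef_[0]`, and `intercept_[0]`, together with the numeric value of the kernel parameter. Because the objective is nonconvex, the point returned depends on `x_start`, which is the initial-point dependence the abstract refers to.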