AI summary
The authors develop a general theory of regularized M-estimation in reproducing kernel Hilbert spaces (RKHS), which are function spaces commonly used for learning. They prove that the estimator exists and is measurable for a wide range of convex and non-convex loss functions, including bounded losses that are robust to outliers. They establish sharp convergence rates, splitting the error into bias and variance terms governed by a new complexity measure; the variance does not depend on misspecification, while the bias depends on a source condition parameter. For tensor product Sobolev spaces they obtain new rates linked to functions with dominating mixed smoothness, which explains why these estimators circumvent the curse of dimensionality. The proofs combine functional analysis and empirical process theory to linearise the objective asymptotically, without requiring closed-form solutions or global Lipschitz assumptions. The estimators are implemented in C++, and numerical experiments support the theory.
Regularized M-estimation; Reproducing Kernel Hilbert Space (RKHS); Convex and non-convex loss functions; Bias-variance decomposition; Sobolev spaces; Dominating mixed smoothness; Curse of dimensionality; Functional analysis; Empirical process theory; Asymptotic linearization
Abstract
We develop a comprehensive theory for regularized M-estimation in reproducing kernel Hilbert spaces. Under mild conditions on the loss, we establish existence and measurability of the estimator, covering a wide range of convex and non-convex losses, including bounded robust losses. We further prove sharp rates of convergence with an explicit bias-variance decomposition governed by a novel complexity measure. We show that the variance is independent of misspecification, while the bias depends on a source condition parameter known in the learning literature. For tensor product Sobolev spaces, we obtain new rates that connect to spaces of functions with dominating mixed smoothness, substantially extending existing results and explaining why these estimators circumvent the curse of dimensionality. Our methodology, which combines elements of functional analysis and empirical process theory, allows for an asymptotic linearisation of the objective function that avoids both closed-form solutions and global Lipschitz assumptions, and may be of independent interest. The estimators are implemented in C++, and the theory is supported by numerical experiments.
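For orientation, the objects named in the abstract can be written in the standard textbook form sketched below. The notation is an assumption made here for illustration and is not taken from the paper: $\mathcal{H}$ denotes the RKHS, $L$ the loss, $\lambda > 0$ the regularization parameter, $(x_i, y_i)_{i=1}^n$ the sample, and $f_0$ the population target, which need not lie in $\mathcal{H}$ under misspecification. The paper's precise definitions and conditions may differ.

```latex
% Regularized M-estimator (empirical objective); notation assumed, not the paper's
\hat f_{n,\lambda} \in \operatorname*{arg\,min}_{f \in \mathcal{H}}
  \; \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)
  + \lambda \, \lVert f \rVert_{\mathcal{H}}^{2}

% Population counterpart at the same regularization level
f_{\lambda} \in \operatorname*{arg\,min}_{f \in \mathcal{H}}
  \; \mathbb{E}\bigl[ L\bigl(Y, f(X)\bigr) \bigr]
  + \lambda \, \lVert f \rVert_{\mathcal{H}}^{2}

% Generic bias-variance split of the estimation error
\hat f_{n,\lambda} - f_{0}
  = \underbrace{\bigl(\hat f_{n,\lambda} - f_{\lambda}\bigr)}_{\text{variance (stochastic) term}}
  + \underbrace{\bigl(f_{\lambda} - f_{0}\bigr)}_{\text{bias (approximation) term}}
```

In this reading, the first term is the sampling fluctuation around the population regularized solution and the second is the deterministic error introduced by regularization. The abstract's statement that the variance is independent of misspecification while the bias depends on a source condition parameter refers to terms of this kind.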