AI summary
The authors study how hard it is to check whether a certain optimality condition (called approximate first-order stationarity) holds at a given point for piecewise-affine (PA) functions, which are important in nonsmooth optimization and arise, for example, in ReLU neural networks. They focus on how the problem's difficulty changes when the dimension of the input space is fixed. They give algorithms that work well in fixed dimension for some variants of the problem, but prove that other variants remain very hard (W[1]-hard), with lower bounds that rule out fast algorithms under common complexity assumptions. Their findings also apply to checking whether points are local minima and to training simple shallow CNNs with ReLU activations. Thus, they map out when these problems are tractable or hard depending on the dimension.
piecewise-affine function · approximate first-order stationarity · parameterized complexity · fixed dimension · W[1]-hardness · Exponential Time Hypothesis · local minimality · ReLU neural networks · CNN training losses · nonsmooth optimization
Abstract
We study the parameterized complexity of testing approximate first-order stationarity at a prescribed point for continuous piecewise-affine (PA) functions, a basic task in nonsmooth optimization. PA functions form a canonical model for nonsmooth stationarity testing and capture the local polyhedral geometry that appears in ReLU-type training losses. Recent work by Tian and So (SODA 2025) shows that testing approximate stationarity notions for PA functions is computationally intractable in the worst case, and identifies fixed-dimensional tractability as an open direction. We address this direction from the viewpoint of parameterized complexity, with the ambient dimension $d$ as the parameter. We give XP algorithms in fixed dimension for the tractable problem variants, and prove W[1]-hardness for the complementary ones. Moreover, lower bounds under the Exponential Time Hypothesis rule out algorithms running in time $\rho(d) \cdot \mathrm{size}^{o(d)}$ for any computable function $\rho$, where $\mathrm{size}$ denotes the total binary encoding length of the stationarity-testing instance. As a further consequence, our results yield the corresponding parameterized complexity picture for testing local minimality of continuous PA functions. We further extend our hardness results to a family of shallow ReLU CNN training losses, with stationarity tested in the trainable weight space. Thus, the same parameterized-complexity picture emerges even for simple CNN training losses.
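To make the testing task concrete, here is a minimal sketch (not the paper's algorithm) of one standard formalization of approximate stationarity: a point $\bar x$ is $\varepsilon$-stationary if $\mathrm{dist}(0, \partial f(\bar x)) \le \varepsilon$, where for a continuous PA function the Clarke subdifferential $\partial f(\bar x)$ is the convex hull of the gradients of the (essentially) active affine pieces at $\bar x$. Given those gradients explicitly, the test is a convex QP over the simplex. The function name `is_eps_stationary` is ours, and the assumption that the active gradients are handed to us is a real one: extracting them from the instance encoding is where the combinatorial difficulty studied in the paper lives.

```python
# Minimal sketch: test dist(0, conv{g_1, ..., g_k}) <= eps by minimizing
# || sum_i lambda_i g_i ||^2 over the probability simplex (a convex QP).
# Assumes the gradients of the active affine pieces at the query point
# are given explicitly; this enumeration is not free in general.
import numpy as np
from scipy.optimize import minimize

def is_eps_stationary(grads: np.ndarray, eps: float) -> bool:
    """grads: (k, d) array, row i = gradient of the i-th active affine piece."""
    k = grads.shape[0]
    obj = lambda lam: float(np.sum((lam @ grads) ** 2))
    res = minimize(
        obj,
        x0=np.full(k, 1.0 / k),                              # barycenter start
        bounds=[(0.0, 1.0)] * k,                             # lambda_i in [0, 1]
        constraints=[{"type": "eq", "fun": lambda lam: lam.sum() - 1.0}],
        method="SLSQP",
    )
    return float(np.sqrt(res.fun)) <= eps

# Example: f(x) = |x_1| + |x_2| at the origin. The four sign-pattern
# gradients average to zero, so the origin is stationary.
g = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
print(is_eps_stationary(g, eps=1e-6))  # True (up to solver tolerance)
```

This sketch only illustrates the continuous side of the problem; the parameterized-complexity results above concern how the cost of the overall test scales with the ambient dimension $d$ when the PA function is given by its binary encoding.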