Report the Floor: A Training-Free Conformal Interval Is a Mandatory Baseline for Probabilistic Time-Series Forecasting
2026-06-08 • Machine Learning
AI summary
The authors show that ConformalNaive, a very simple interval method that requires no training, outperforms many common baseline forecasting methods on one-step-ahead prediction across 2,217 real-world time series. It matches the simpler learned conformal predictors, is better calibrated than a trained neural forecaster (DeepNPTS), and is beaten only by adaptive-online and ensemble methods that track distribution shift. At multi-step seasonal horizons the pattern reverses: the random-walk floor is the weakest method and the seasonal pool (CSP) performs best. The authors argue that ConformalNaive should be a mandatory baseline whenever a learned probabilistic forecaster claims gains.
probabilistic forecasting, conformal prediction, time series, split-conformal residual quantile, calibration, baseline methods, quantile regression, ensemble methods, distribution shift, multi-step forecasting
Authors
Valery Manokhin
Abstract
Probabilistic forecasters are increasingly learned, yet the baselines they are compared against are often weak or omitted. We show that the simplest possible conformal interval - a last-value point forecast wrapped in a finite-sample split-conformal residual quantile, with no parameters and no training - is a far stronger baseline than its near-total absence from recent learned-forecasting and conformal-time-series comparisons would suggest. In one-step-ahead online forecasting across 2,217 real series from nine public sources (Monash, LOTSA, the LTSF traffic/electricity/weather suites, METR-LA, BOOM, nips/probts), this ConformalNaive interval decisively beats the naive value-quantile baselines, the entire NPTS family (NPTS 73%, SeasonalNPTS 64% of series), and the published Conformal Seasonal Pools (CSP) method (71% of series, bootstrap 95% CI [69, 73], paired Wilcoxon p ≈ 7.6e-135); it is on par with the simpler learned conformal predictors (RCI, quantile regression; median relative Winkler within 2%) and is beaten only by the adaptive-online and ensemble methods (SPCI, ACI, AgACI), which track distribution shift and lead by 9-33% relative Winkler. It is also better calibrated than a trained neural forecaster: on the six datasets that introduced DeepNPTS, the trivial floors cover the truth 84-85% of the time at a nominal 95%, versus DeepNPTS's 66%. At multi-step seasonal horizons the picture inverts: the random-walk floor is the weakest method and the seasonal pool (CSP) wins - a boundary we map. Finally we give ConformalNaive+, a one-line, training-free, horizon-adaptive selector that attains the better of two complementary floors at every horizon with restored coverage. We argue the matching conformal naive floor must be a mandatory baseline whenever a learned probabilistic forecaster claims gains.
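
To make the construction concrete, here is a minimal Python sketch of a last-value point forecast wrapped in a finite-sample split-conformal residual quantile, together with the standard Winkler interval score. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the calibration window, the online update scheme, or whether the interval is symmetric, so the sketch assumes symmetric absolute residuals computed over the whole available history. The function names `conformal_naive_interval` and `winkler_score` are hypothetical.

```python
import math

import numpy as np


def conformal_naive_interval(history, alpha=0.05):
    """One-step-ahead interval around the last observed value.

    Assumption (not from the paper): calibration scores are the absolute
    one-step errors |y_t - y_{t-1}| of the last-value forecast over the
    whole history, and the interval is symmetric.
    """
    y = np.asarray(history, dtype=float)
    if y.size < 2:
        raise ValueError("need at least two observations")

    # Absolute errors the last-value forecast would have made in the past.
    residuals = np.sort(np.abs(np.diff(y)))
    n = residuals.size

    # Finite-sample split-conformal rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))

    # With too few residuals for the requested level, the interval is unbounded.
    q = residuals[k - 1] if k <= n else np.inf

    point = y[-1]
    return point - q, point + q


def winkler_score(lower, upper, actual, alpha=0.05):
    """Winkler interval score: width plus a 2/alpha penalty per unit of miss."""
    width = upper - lower
    if actual < lower:
        return width + (2.0 / alpha) * (lower - actual)
    if actual > upper:
        return width + (2.0 / alpha) * (actual - upper)
    return width


if __name__ == "__main__":
    series = [1, 2, 4, 7, 11, 16]
    lo, hi = conformal_naive_interval(series, alpha=0.2)
    print(lo, hi, winkler_score(lo, hi, actual=23, alpha=0.2))
```

The only inputs are the series itself and the miscoverage level `alpha`, which is why a baseline of this kind needs no training. An adaptive-online method such as ACI would adjust the level or the calibration window over time, which this sketch deliberately does not do.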