RAID: Semantic Graph Diffusion for True Cold-Start and Cross-Lingual Forecasting
2026-06-15 • Artificial Intelligence
AI summary
The authors present RAID, a system that forecasts time series for items with no prior history, known as true cold-start situations. Instead of relying on past observations, RAID uses the item's metadata (such as textual descriptions) to retrieve semantically similar items and form an initial forecast. It then refines this forecast with a gated diffusion module that models the remaining uncertainty. RAID reduces inference latency by an order of magnitude through non-autoregressive decoding, and a model trained on English descriptions can be applied to items described in other languages without additional training.
time series forecasting, cold-start problem, metadata, semantic retrieval, diffusion models, multilingual embeddings, non-autoregressive decoding, graph models, cross-lingual transfer
Authors
Arunkumar V, Manoranjan Gandhudi, Gangadharan G. R., Arun Prakash, S. Senthilkumar
Abstract
Time-series foundation models show strong transfer performance when given a non-empty history window. However, true cold-start scenarios, where a new item has no prior observations, violate this assumption. We propose RAID (Retrieval-Augmented Iterative Diffusion), a framework that replaces history-based correlation learning with metadata-driven semantic retrieval and graph-conditioned diffusion. RAID maps textual metadata into a shared semantic space using a frozen multilingual embedding model and constructs an inductive retrieval graph that extends naturally to unseen items. It first forms a base forecast by aggregating information from semantically related neighbors, then refines this forecast with a gated diffusion module to model residual uncertainty. Under a strict true cold-start protocol, RAID outperforms strong foundation models and competitive baselines on both forecasting accuracy and prediction interval coverage, while reducing inference latency by an order of magnitude through non-autoregressive decoding. The shared semantic space also enables zero-shot cross-lingual transfer, allowing a model trained on English descriptions to generalize to items described in other languages without direct supervision.
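To make the retrieval step concrete, the following is a minimal sketch of how a base forecast for an item with no history could be formed from its semantically nearest neighbors. It is not the authors' implementation. The hand-written 2-D vectors stand in for the output of the frozen multilingual embedding model, and the function name `base_forecast`, the top-k selection, the softmax weighting, and the `temperature` parameter are all illustrative assumptions. The retrieval graph and the gated diffusion refinement are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def base_forecast(query_emb, item_embs, item_series, k=2, temperature=0.1):
    """Similarity-weighted average of the k nearest neighbors' series.

    Hypothetical helper: k, temperature, and the softmax weighting are
    assumptions made for illustration, not details from the paper.
    """
    sims = [cosine(query_emb, e) for e in item_embs]
    # Indices of the k most similar known items.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    # Numerically stable softmax over the selected similarities.
    max_sim = max(sims[i] for i in top)
    raw = [math.exp((sims[i] - max_sim) / temperature) for i in top]
    total = sum(raw)
    weights = [r / total for r in raw]
    # The whole horizon is produced in one pass, without step-by-step decoding.
    horizon = len(item_series[0])
    forecast = [
        sum(w * item_series[i][t] for w, i in zip(weights, top))
        for t in range(horizon)
    ]
    return forecast, top, weights


# Toy data: four known items (embedding + observed series) and one new item.
item_embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
item_series = [[10, 11, 12], [12, 13, 14], [100, 100, 100], [0, 0, 0]]
query_emb = [1.0, 0.05]  # metadata embedding of the cold-start item

forecast, neighbors, weights = base_forecast(query_emb, item_embs, item_series)
print(neighbors, [round(x, 2) for x in forecast])
```

Because the forecast depends only on the new item's embedding, the same function applies unchanged to an item described in another language, provided the embedding model places that description near its English counterparts.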