STaT: Resolving Shape Distortion in Non-Stationary Time Series via Tri-Modal Synergy
2026-05-25 • Machine Learning
AI summary
The authors created a new forecasting method called STaT that combines three types of information: symbolic (turning numbers into tokens to spot key patterns), temporal (understanding time order), and textual (using related text for context). This approach helps avoid overly smooth predictions that miss important changes. They tested STaT on eight real-world datasets and found it improved accuracy and better captured the true shape of the data compared to previous methods.
time series forecasting, multimodal learning, symbolic modality, temporal modality, textual modality, non-stationary environments, structural patterns, shape distortion, magnitude indicators
Authors
Hui Cheng, Jinsheng Guo, Zhenhao Weng, Yan Qiao, Meng Li
Abstract
Recent work in time series forecasting often integrates textual and visual modalities with numerical models to better handle non-stationary environments. Despite strong numerical results, existing multimodal approaches face a dilemma: minimizing average error tends to produce overly smooth forecasts that miss essential fluctuations. To address this limitation, we introduce STaT, a multimodal architecture for Symbolic-Temporal-Textual Alignment that unites three synergistic modalities. Specifically, the symbolic modality converts continuous time series into discrete tokens, which helps identify structural patterns and turning points; the temporal modality extracts inherent sequential dependencies; and the textual modality uses domain semantics to guide macroscopic forecasting trends. Evaluations on eight real-world benchmarks show that STaT improves conventional magnitude indicators by up to 8.9% while reducing shape distortion by up to 8.5%.
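To make the symbolic modality concrete, the following is a minimal sketch of one way a continuous series could be turned into discrete tokens. The abstract does not specify STaT's tokenizer, so this is an assumption: a SAX-like scheme that z-normalizes the series and assigns each point to an equal-frequency bin. The names `symbolize`, `turning_points`, and `n_symbols` are hypothetical and chosen only for illustration.

```python
import numpy as np


def symbolize(series, n_symbols=4, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map a 1-D series to a token string (assumed SAX-like scheme, not STaT's)."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    # z-normalize so the bins do not depend on the series' scale or offset.
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    # Interior quantiles give roughly equal-frequency bin edges.
    edges = np.quantile(z, np.linspace(0, 1, n_symbols + 1)[1:-1])
    idx = np.searchsorted(edges, z, side="right")
    return "".join(alphabet[i] for i in idx)


def turning_points(tokens):
    """Return indices where the token sequence reverses direction.

    Flat runs are skipped; a reversal is reported at the last index
    before the new direction starts.
    """
    levels = [ord(c) for c in tokens]
    points = []
    last_dir = 0
    for i in range(1, len(levels)):
        step = levels[i] - levels[i - 1]
        direction = (step > 0) - (step < 0)
        if direction != 0:
            if last_dir != 0 and direction != last_dir:
                points.append(i - 1)
            last_dir = direction
    return points


if __name__ == "__main__":
    series = [1, 2, 3, 4, 3, 2, 1, 2, 3]
    tokens = symbolize(series, n_symbols=3)
    print(tokens, turning_points(tokens))
```

In this sketch, a reversal in the token sequence marks a candidate turning point even when the numeric change is small, which illustrates why a discrete view can preserve shape information that an average-error objective tends to smooth away.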