Beyond Uniform Tokens: Adaptive Compression for Time Series Language Models
2026-06-11 • Computation and Language
AI summary
The authors examine how large language models process time series data and text together, noting that the two kinds of tokens carry information very differently. They find that many time series tokens carry redundant frequency information while only a few hold critical signals, and that the influence of text prompt tokens weakens in deeper layers of the model. Building on these observations, they compress time series tokens using their frequency-domain structure and progressively reduce prompt tokens layer by layer. Their method speeds up inference by up to 7.68 times and improves performance in 78% of the evaluated settings across forecasting, classification, imputation, and anomaly detection.
Large Language Models, Time Series Analysis, Tokens, Spectral Analysis, Frequency Domain, Token Compression, Prompt Tokens, Model Layers, Forecasting, Anomaly Detection
Authors
Jialin Gan, Xin Qiu, Guangzhe Chen, Xue Wang
Abstract
Large language models (LLMs) have enabled time series (TS) analysis by jointly modeling numerical observations and textual context through a shared token interface. However, TS tokens and prompt tokens exhibit fundamentally different information structures, making uniform token processing inefficient. In this paper, we study token efficiency in TS language modeling from an asymmetric-token perspective. We show that TS tokens have highly uneven spectral contributions, where many tokens share redundant frequency patterns while a small subset preserves critical temporal evidence. We also observe that prompt-token influence attenuates with model depth, suggesting that full prompt retention across all layers is unnecessary. Based on these findings, we develop an adaptive token budgeting framework that compresses TS tokens via frequency-domain structure and progressively reduces prompt tokens across layers. Experiments across forecasting, classification, imputation, and anomaly detection demonstrate up to 7.68× inference acceleration and performance gains in 78% of evaluated settings, showing the effectiveness of asymmetric token compression for scalable TS foundation models.
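The abstract does not specify how tokens are scored or how the prompt budget shrinks, so the following is only a minimal illustrative sketch of the two ideas, not the authors' method. It assumes that TS tokens are non-overlapping patches, that a patch's importance can be approximated by its non-DC spectral energy, and that the prompt budget decays linearly with depth. The function names (`spectral_energy`, `select_ts_tokens`, `prompt_budget`) and the parameters `patch_len`, `keep_ratio`, and `final_ratio` are all hypothetical.

```python
import cmath
import math


def spectral_energy(patch):
    """Non-DC spectral energy of one patch, computed with a naive DFT.

    Assumption: energy outside the zero-frequency bin is a rough proxy
    for how much temporal evidence a patch carries.
    """
    n = len(patch)
    energy = 0.0
    for k in range(1, n // 2 + 1):  # skip k = 0 (the mean / DC component)
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(patch)
        )
        energy += abs(coeff) ** 2
    return energy


def select_ts_tokens(series, patch_len, keep_ratio):
    """Split a series into non-overlapping patches and keep the
    highest-energy fraction, preserving temporal order.

    Returns (kept_indices, kept_patches).
    """
    patches = [
        series[i:i + patch_len]
        for i in range(0, len(series) - patch_len + 1, patch_len)
    ]
    scores = [spectral_energy(p) for p in patches]
    n_keep = max(1, math.ceil(keep_ratio * len(patches)))
    ranked = sorted(range(len(patches)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore original ordering
    return kept, [patches[i] for i in kept]


def prompt_budget(n_prompt_tokens, n_layers, final_ratio):
    """Number of prompt tokens retained at each layer.

    Assumption: a linear decay from the full prompt at layer 0 down to
    final_ratio of the prompt at the last layer.
    """
    budgets = []
    for layer in range(n_layers):
        frac = 1.0 - (1.0 - final_ratio) * layer / max(1, n_layers - 1)
        budgets.append(max(1, round(n_prompt_tokens * frac)))
    return budgets


if __name__ == "__main__":
    flat = [0.0] * 8
    fast = [1.0, -1.0] * 4                        # alternating signal
    slow = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    kept, _ = select_ts_tokens(flat + fast + flat + slow, patch_len=8, keep_ratio=0.5)
    print("kept TS patches:", kept)
    print("prompt budget per layer:", prompt_budget(20, 5, 0.2))
```

In this toy input the two flat patches carry no spectral energy and are dropped, while the two oscillating patches are kept. A real system would likely score tokens in the model's embedding space and choose which prompt tokens to drop using attention or a similar signal; both are outside what the abstract states.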