Beyond Uniform Tokens: Adaptive Compression for Time Series Language Models
Summary
The paper studies token efficiency in time series (TS) language models from an asymmetric-token perspective. It reports two findings: TS tokens contain highly redundant frequency patterns, with only a small subset carrying critical temporal evidence, and the influence of prompt tokens attenuates with model depth. Building on these findings, the authors propose an adaptive token budgeting framework that compresses TS tokens using their frequency-domain structure and progressively reduces prompt tokens across layers. Evaluated on forecasting, classification, imputation, and anomaly detection, the method achieves up to 7.68× inference acceleration and improves performance in 78% of settings, which the authors present as evidence that asymmetric token compression is effective for scalable TS foundation models.
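To make the idea of frequency-redundant TS tokens concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes each token is a fixed-length patch of a univariate series, and it uses a hypothetical greedy rule that drops a patch when the cosine similarity between its magnitude spectrum and that of an already kept patch reaches a threshold. The names `magnitude_spectrum`, `select_tokens`, and `threshold` are illustrative only.

```python
import cmath
import math


def magnitude_spectrum(patch):
    """Magnitudes of the non-negative-frequency DFT bins of one patch (naive O(n^2) DFT)."""
    n = len(patch)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(patch)))
        for k in range(n // 2 + 1)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_tokens(patches, threshold=0.95):
    """Assumed greedy rule: keep a patch only if its spectrum differs from every kept one."""
    kept, kept_spectra = [], []
    for i, patch in enumerate(patches):
        spec = magnitude_spectrum(patch)
        if all(cosine(spec, s) < threshold for s in kept_spectra):
            kept.append(i)
            kept_spectra.append(spec)
    return kept


# Toy series: six patches that each hold one period of a sine, then one patch with a spike.
patch_len = 8
series = [math.sin(2 * math.pi * t / patch_len) for t in range(6 * patch_len)]
series += [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0]
patches = [series[i:i + patch_len] for i in range(0, len(series), patch_len)]

kept = select_tokens(patches)
print(kept)  # [0, 6]
```

In this toy case the six sine patches share one spectrum, so only the first survives, while the spike patch has a flat spectrum and is retained. A real system would likely use an FFT and a learned or task-aware importance score rather than this fixed threshold.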
Key points
Time series (TS) tokens exhibit uneven spectral contributions: many share redundant frequency patterns, while a small subset preserves critical temporal evidence.
The influence of prompt tokens attenuates with model depth, so retaining the full prompt at every layer is unnecessary.
The proposed adaptive token budgeting framework compresses TS tokens using their frequency-domain structure and progressively reduces prompt tokens across layers.
Across forecasting, classification, imputation, and anomaly detection, the method achieves up to 7.68× inference acceleration and improves performance in 78% of evaluated settings.
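The layer-wise reduction of prompt tokens can be pictured with a small sketch. The summary does not state the paper's actual schedule or scoring rule, so everything here is an assumption: a geometric decay of the per-layer budget (`decay`, `min_tokens`) and a hypothetical per-token importance score, for example the attention mass each prompt token receives.

```python
def prompt_budget(num_prompt_tokens, num_layers, decay=0.8, min_tokens=1):
    """Prompt tokens retained at each layer under an assumed geometric decay."""
    return [
        max(min_tokens, int(num_prompt_tokens * decay ** layer))
        for layer in range(num_layers)
    ]


def prune_prompt(scores, budget):
    """Indices of the `budget` highest-scoring prompt tokens, returned in original order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:budget]
    return sorted(top)


# Budget per layer for a 64-token prompt and a 6-layer model, halving at each layer.
schedule = prompt_budget(num_prompt_tokens=64, num_layers=6, decay=0.5)
print(schedule)  # [64, 32, 16, 8, 4, 2]

# Hypothetical importance scores for five prompt tokens at some layer.
scores = [0.1, 0.7, 0.05, 0.9, 0.3]
print(prune_prompt(scores, budget=2))  # [1, 3]
```

Keeping the surviving indices in their original order preserves the positional structure of the prompt. Whether the budget should decay geometrically, linearly, or adaptively per input is a design choice this sketch does not settle.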