Time series forecasting centers on capturing dependencies and trends along the temporal axis. Using longer input sequences not only helps a model learn long-term trends and periodic patterns in the data, but also allows it to better accommodate distribution drift; long input windows are therefore an essential ingredient for improving forecasting performance. However, existing Transformer-based models incur a sharp increase in computational cost when processing long input sequences. To address this problem, we propose MscTNT, an efficient Transformer-based architecture that can handle longer historical windows at reduced cost. MscTNT partitions the input sequence into coarse-grained patches and fine-grained subpatches and processes them with a dual-level stack of Transformer encoders, strengthening the model's representational capacity. This design aggregates multi-scale features and effectively captures the temporal dependencies in the series. The architecture is highly flexible, allowing a favorable trade-off between predictive accuracy and the time and memory cost of training and deployment. With appropriate parameter settings, MscTNT achieves efficient token reduction and substantially lowers computational cost; under higher-cost settings, it attains near state-of-the-art (SOTA) predictive accuracy with less computational overhead than comparable models.
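To make the patch/subpatch idea concrete, the following is a minimal PyTorch sketch of a dual-level encoder: an inner Transformer attends over fine-grained subpatches within each coarse patch, the subpatch features are pooled into one token per patch, and an outer Transformer attends across the resulting patch tokens. All module names, dimensions, and the mean-pooling aggregation here are illustrative assumptions, not the actual MscTNT implementation.

```python
import torch
import torch.nn as nn

class DualLevelEncoder(nn.Module):
    """Illustrative dual-level (patch / subpatch) Transformer encoder.

    Hypothetical layer names and sizes; a sketch of the general idea,
    not the MscTNT reference code.
    """
    def __init__(self, patch_len=32, subpatch_len=8, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        assert patch_len % subpatch_len == 0
        self.patch_len = patch_len
        self.subpatch_len = subpatch_len
        # Embed each raw subpatch into d_model for the inner (fine-grained) encoder.
        self.sub_embed = nn.Linear(subpatch_len, d_model)
        inner_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        outer_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.inner = nn.TransformerEncoder(inner_layer, n_layers)   # attention within a patch
        self.outer = nn.TransformerEncoder(outer_layer, n_layers)   # attention across patches
        # Map the pooled subpatch features of one patch to a single patch token.
        self.patch_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        # x: (batch, seq_len) univariate series; seq_len must be a multiple of patch_len.
        b, L = x.shape
        n_patches = L // self.patch_len
        n_sub = self.patch_len // self.subpatch_len
        # Split into coarse patches, then fine subpatches: (b * n_patches, n_sub, subpatch_len).
        sub = x.reshape(b * n_patches, n_sub, self.subpatch_len)
        sub = self.inner(self.sub_embed(sub))              # fine-grained temporal dependencies
        patch_tokens = self.patch_proj(sub.mean(dim=1))    # aggregate subpatches into one patch token
        patch_tokens = patch_tokens.reshape(b, n_patches, -1)
        return self.outer(patch_tokens)                    # coarse-grained temporal dependencies

# Example: a 512-step input window is reduced to 512 / 32 = 16 outer tokens.
y = DualLevelEncoder()(torch.randn(4, 512))
print(y.shape)  # torch.Size([4, 16, 64])
```

In this sketch, the token reduction discussed above comes from the outer encoder operating on one token per coarse patch rather than one token per time step, so its attention cost scales with the number of patches instead of the raw sequence length.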