Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting

Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting

Razvan-Gabriel Cirstea, Chenjuan Guo, Bin Yang, Tung Kieu, Xuanyi Dong, Shirui Pan

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 1994-2001. https://doi.org/10.24963/ijcai.2022/277

A variety of real-world applications rely on far future information to make decisions, thus calling for efficient and accurate long sequence multivariate time series forecasting. While recent attention-based forecasting models show strong abilities in capturing long-term dependencies, they still suffer from two key limitations. First, canonical self attention has a quadratic complexity w.r.t. the input time series length, thus falling short in efficiency. Second, different variables’ time series often have distinct temporal dynamics, which existing studies fail to capture, as they use the same model parameter space, e.g., projection matrices, for all variables’ time series, thus falling short in accuracy. To ensure high efficiency and accuracy, we propose Triformer, a triangular, variable-specific attention. (i) Linear complexity: we introduce a novel patch attention with linear complexity. When stacking multiple layers of the patch attentions, a triangular structure is proposed such that the layer sizes shrink exponentially, thus maintaining linear complexity. (ii) Variable-specific parameters: we propose a light-weight method to enable distinct sets of model parameters for different variables’ time series to enhance accuracy without compromising efficiency and memory usage. Strong empirical evidence on four datasets from multiple domains justifies our design choices, and it demonstrates that Triformer outperforms state-of-the-art methods w.r.t. both accuracy and efficiency. Source code is publicly available at https://github.com/razvanc92/triformer.
Keywords:
Data Mining: Mining Spatial and/or Temporal Data
Machine Learning: Recurrent Networks
Machine Learning: Regression
Machine Learning: Time-series; Data Streams