Related papers: Sequence Complementor: Complementing Transformers …

Transformers predicting the future. Applying attention in next-frame and time series forecasting

Recurrent Neural Networks were, until recently, one of the best ways to capture the timely dependencies in sequences. However, with the introduction of the Transformer, it has been proven that an architecture with only attention-mechanisms…

Machine Learning · Computer Science 2021-08-19 Radostin Cholakov, Todor Kolev

Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning

The Transformer is a highly successful deep learning model that has revolutionised the world of artificial neural networks, first in natural language processing and later in computer vision. This model is based on the attention mechanism…

Machine Learning · Computer Science 2023-05-09 Riccardo Ughi, Eugenio Lomurno, Matteo Matteucci

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

The recent boom of linear forecasting models questions the ongoing passion for architectural modifications of Transformer-based forecasters. These forecasters leverage Transformers to model the global dependencies over temporal tokens of…

Machine Learning · Computer Science 2024-03-15 Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long

Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting

Sequence modeling faces challenges in capturing long-range dependencies across diverse tasks. Recent linear and transformer-based forecasters have shown superior performance in time series forecasting. However, they are constrained by their…

Machine Learning · Computer Science 2024-11-25 Bong Gyun Kang, Dongjun Lee, HyunGi Kim, DoHyun Chung, Sungroh Yoon

Patch-Level Tokenization with CNN Encoders and Attention for Improved Transformer Time-Series Forecasting

Transformer-based models have shown strong performance in time-series forecasting by leveraging self-attention to model long-range temporal dependencies. However, their effectiveness depends critically on the quality and structure of input…

Machine Learning · Computer Science 2026-02-11 Saurish Nagrath, Saroj Kumar Panigrahy

Attention as Robust Representation for Time Series Forecasting

Time series forecasting is essential for many practical applications, with the adoption of transformer-based models on the rise due to their impressive performance in NLP and CV. Transformers' key feature, the attention mechanism,…

Machine Learning · Computer Science 2024-02-09 PeiSong Niu, Tian Zhou, Xue Wang, Liang Sun, Rong Jin

sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

In recent years, numerous Transformer-based models have been applied to long-term time-series forecasting (LTSF) tasks. However, recent studies with linear models have questioned their effectiveness, demonstrating that simple linear layers…

Machine Learning · Computer Science 2024-08-20 Jiaheng Yin, Zhengxin Shi, Jianshen Zhang, Xiaomin Lin, Yulin Huang, Yongzhi Qi, Wei Qi

A Practical Survey on Faster and Lighter Transformers

Recurrent neural networks are effective models to process sequences. However, they are unable to learn long-term dependencies because of their inherent sequential nature. As a solution, Vaswani et al. introduced the Transformer, a model…

Machine Learning · Computer Science 2023-03-28 Quentin Fournier, Gaétan Marceau Caron, Daniel Aloise

Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting

Transformer-based time series forecasting has recently gained strong interest due to the ability of transformers to model sequential data. Most of the state-of-the-art architectures exploit either temporal or inter-channel dependencies,…

Machine Learning · Computer Science 2025-03-25 Davide Villaboni, Alberto Castellini, Ivan Luciano Danesi, Alessandro Farinelli

Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution

Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning. Transformer models have been adopted to deliver high prediction capacity because of the high computational…

Machine Learning · Computer Science 2023-01-06 Yan Li, Xinjiang Lu, Haoyi Xiong, Jian Tang, Jiantao Su, Bo Jin, Dejing Dou

Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting

Transformers have shown great power in time series forecasting due to their global-range modeling ability. However, their performance can degenerate terribly on non-stationary real-world data in which the joint distribution changes over…

Machine Learning · Computer Science 2023-11-27 Yong Liu, Haixu Wu, Jianmin Wang, Mingsheng Long

Filter then Attend: Improving attention-based Time Series Forecasting with Spectral Filtering

Transformer-based models are at the forefront in long time-series forecasting (LTSF). While in many cases, these models are able to achieve state of the art results, they suffer from a bias toward low-frequencies in the data and high…

Machine Learning · Computer Science 2026-05-13 Elisha Dayag, Nhat Thanh Van Tran, Jack Xin

Multi-Task Time Series Forecasting With Shared Attention

Time series forecasting is a key component in many industrial and business decision processes and recurrent neural network (RNN) based models have achieved impressive progress on various time series forecasting tasks. However, most of the…

Machine Learning · Computer Science 2021-01-26 Zekai Chen, Jiaze E, Xiao Zhang, Hao Sheng, Xiuzheng Cheng

DRAformer: Differentially Reconstructed Attention Transformer for Time-Series Forecasting

Time-series forecasting plays an important role in many real-world scenarios, such as equipment life cycle forecasting, weather forecasting, and traffic flow forecasting. It can be observed from recent research that a variety of…

Machine Learning · Computer Science 2022-06-14 Benhan Li, Shengdong Du, Tianrui Li, Jie Hu, Zhen Jia

Transformers for Supervised Online Continual Learning

Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks that are not naturally sequential such as image…

Machine Learning · Computer Science 2024-03-05 Jorg Bornschein, Yazhe Li, Amal Rannen-Triki

MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing

Multivariate time series forecasting has been widely used in various practical scenarios. Recently, Transformer-based models have shown significant potential in forecasting tasks due to the capture of long-range dependencies. However,…

Machine Learning · Computer Science 2023-02-10 Zhe Li, Zhongwen Rao, Lujia Pan, Zenglin Xu

HT-Transformer: Event Sequences Classification by Accumulating Prefix Information with History Tokens

Deep learning has achieved remarkable success in modeling sequential data, including event sequences, temporal point processes, and irregular time series. Recently, transformers have largely replaced recurrent networks in these tasks.…

Machine Learning · Computer Science 2025-08-05 Ivan Karpukhin, Andrey Savchenko

Recency Biased Causal Attention for Time-series Forecasting

Recency bias is a useful inductive prior for sequential modeling: it emphasizes nearby observations and can still allow longer-range dependencies. Standard Transformer attention lacks this property, relying on all-to-all interactions that…

Machine Learning · Computer Science 2026-04-23 Kareem Hegazy, Michael W. Mahoney, N. Benjamin Erichson

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability…

Machine Learning · Computer Science 2021-03-30 Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang

Minimal Time Series Transformer

Transformer is the state-of-the-art model for many natural language processing, computer vision, and audio analysis problems. Transformer effectively combines information from the past input and output samples in auto-regressive manner so…

Machine Learning · Computer Science 2025-03-14 Joni-Kristian Kämäräinen