Related papers: Stabilizing Transformer-Based Action Sequence Gene…

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

Transformer, originally devised for natural language processing, has also attested significant success in computer vision. Thanks to its super expressive power, researchers are investigating ways to deploy transformers to reinforcement…

Machine Learning · Computer Science 2023-01-24 Shengchao Hu, Li Shen, Ya Zhang, Yixin Chen, Dacheng Tao

Decision Transformer: Reinforcement Learning via Sequence Modeling

We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling…

Machine Learning · Computer Science 2021-06-25 Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

A Survey on Transformers in Reinforcement Learning

Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings. Recently, a similar surge of using Transformers has appeared in the domain of reinforcement learning (RL), but it is faced…

Machine Learning · Computer Science 2023-09-22 Wenzhe Li, Hao Luo, Zichuan Lin, Chongjie Zhang, Zongqing Lu, Deheng Ye

Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning

Interacting with the actual environment to acquire data is often costly and time-consuming in robotic tasks. Model-based offline reinforcement learning (RL) provides a feasible solution. On the one hand, it eliminates the requirements of…

Machine Learning · Computer Science 2023-10-17 Pengqin Wang, Meixin Zhu, Shaojie Shen

Transformers in Reinforcement Learning: A Survey

Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in…

Machine Learning · Computer Science 2023-07-13 Pranav Agarwal, Aamer Abdul Rahman, Pierre-Luc St-Charles, Simon J. D. Prince, Samira Ebrahimi Kahou

Stabilizing Transformers for Reinforcement Learning

Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP),…

Machine Learning · Computer Science 2019-10-16 Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell

A Practical Survey on Faster and Lighter Transformers

Recurrent neural networks are effective models to process sequences. However, they are unable to learn long-term dependencies because of their inherent sequential nature. As a solution, Vaswani et al. introduced the Transformer, a model…

Machine Learning · Computer Science 2023-03-28 Quentin Fournier, Gaétan Marceau Caron, Daniel Aloise

A Comparison Between Decision Transformers and Traditional Offline Reinforcement Learning Algorithms

The field of Offline Reinforcement Learning (RL) aims to derive effective policies from pre-collected datasets without active environment interaction. While traditional offline RL algorithms like Conservative Q-Learning (CQL) and Implicit…

Machine Learning · Computer Science 2025-11-21 Ali Murtaza Caunhye, Asad Jeewa

Adaptive Transformers in RL

Recent developments in Transformers have opened new interesting areas of research in partially observable reinforcement learning tasks. Results from late 2019 showed that Transformers are able to outperform LSTMs on both memory intense and…

Machine Learning · Computer Science 2020-04-09 Shakti Kumar, Jerrod Parker, Panteha Naderian

Transformers for Supervised Online Continual Learning

Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks that are not naturally sequential such as image…

Machine Learning · Computer Science 2024-03-05 Jorg Bornschein, Yazhe Li, Amal Rannen-Triki

QT-TDM: Planning With Transformer Dynamics Model and Autoregressive Q-Learning

Inspired by the success of the Transformer architecture in natural language processing and computer vision, we investigate the use of Transformers in Reinforcement Learning (RL), specifically in modeling the environment's dynamics using…

Machine Learning · Computer Science 2024-11-19 Mostafa Kotb, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

While agents trained by Reinforcement Learning (RL) can solve increasingly challenging tasks directly from visual observations, generalizing learned skills to novel environments remains very challenging. Extensive use of data augmentation…

Machine Learning · Computer Science 2021-12-10 Nicklas Hansen, Hao Su, Xiaolong Wang

R-Transformer: Recurrent Neural Network Enhanced Transformer

Recurrent Neural Networks have long been the dominating choice for sequence modeling. However, it severely suffers from two issues: impotent in capturing very long-term dependencies and unable to parallelize the sequential computation…

Machine Learning · Computer Science 2019-07-15 Zhiwei Wang, Yao Ma, Zitao Liu, Jiliang Tang

Transformers predicting the future. Applying attention in next-frame and time series forecasting

Recurrent Neural Networks were, until recently, one of the best ways to capture the timely dependencies in sequences. However, with the introduction of the Transformer, it has been proven that an architecture with only attention-mechanisms…

Machine Learning · Computer Science 2021-08-19 Radostin Cholakov, Todor Kolev

A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control

Despite their effectiveness and popularity in offline or model-based reinforcement learning (RL), transformers remain underexplored in online model-free RL due to their sensitivity to training setups and model design decisions such as how…

Machine Learning · Computer Science 2025-10-16 Nikita Kachaev, Daniil Zelezetsky, Egor Cherepanov, Alexey K. Kovalev, Aleksandr I. Panov

A Comparative Study on Transformer vs RNN in Speech Applications

Sequence-to-sequence models have been widely used in end-to-end speech processing, for example, automatic speech recognition (ASR), speech translation (ST), and text-to-speech (TTS). This paper focuses on an emergent sequence-to-sequence…

Computation and Language · Computer Science 2021-06-10 Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang

A Survey of Retentive Network

Retentive Network (RetNet) represents a significant advancement in neural network architecture, offering an efficient alternative to the Transformer. While Transformers rely on self-attention to model dependencies, they suffer from high…

Computation and Language · Computer Science 2025-06-10 Haiqi Yang, Zhiyuan Li, Yi Chang, Yuan Wu

Bootstrapped Transformer for Offline Reinforcement Learning

Offline reinforcement learning (RL) aims at learning policies from previously collected static trajectory data without interacting with the real environment. Recent works provide a novel perspective by viewing offline RL as a generic…

Machine Learning · Computer Science 2022-10-19 Kerong Wang, Hanye Zhao, Xufang Luo, Kan Ren, Weinan Zhang, Dongsheng Li

Advances in Transformers for Robotic Applications: A Review

The introduction of Transformers architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional…

Robotics · Computer Science 2024-12-17 Nikunj Sanghai, Nik Bear Brown

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time. However, we can also view RL as a generic sequence modeling problem,…

Machine Learning · Computer Science 2021-11-30 Michael Janner, Qiyang Li, Sergey Levine