Related papers: Efficient Transformers: A Survey

A Survey of Transformers

Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing. Therefore, it is natural to attract lots of interest from academic and industry…

Machine Learning · Computer Science 2021-06-16 Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu

A Practical Survey on Faster and Lighter Transformers

Recurrent neural networks are effective models to process sequences. However, they are unable to learn long-term dependencies because of their inherent sequential nature. As a solution, Vaswani et al. introduced the Transformer, a model…

Machine Learning · Computer Science 2023-03-28 Quentin Fournier, Gaétan Marceau Caron, Daniel Aloise

Survey: Transformer-based Models in Data Modality Conversion

Transformers have made significant strides across various artificial intelligence domains, including natural language processing, computer vision, and audio processing. This success has naturally garnered considerable interest from both…

Image and Video Processing · Electrical Eng. & Systems 2024-08-12 Elyas Rashno, Amir Eskandari, Aman Anand, Farhana Zulkernine

A Quantitative Review on Language Model Efficiency Research

Language models (LMs) are being scaled and becoming powerful. Improving their efficiency is one of the core research topics in neural information processing systems. Tay et al. (2022) provided a comprehensive overview of efficient…

Machine Learning · Computer Science 2023-06-06 Meng Jiang, Hy Dang, Lingbo Tong

Reformer: The Efficient Transformer

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of…

Machine Learning · Computer Science 2020-02-19 Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya

Transformers in Vision: A Survey

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies…

Computer Vision and Pattern Recognition · Computer Science 2022-01-20 Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah

A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks

Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data. Unlike conventional neural networks or updated versions of Recurrent Neural Networks (RNNs) such…

Machine Learning · Computer Science 2023-06-14 Saidul Islam, Hanae Elmekki, Ahmed Elsebai, Jamal Bentahar, Najat Drawel, Gaith Rjoub, Witold Pedrycz

Transformers in Speech Processing: A Survey

The remarkable success of transformers in the field of natural language processing has sparked the interest of the speech-processing community, leading to an exploration of their potential for modeling long-range dependencies within speech…

Computation and Language · Computer Science 2025-06-05 Siddique Latif, Aun Zaidi, Heriberto Cuayahuitl, Fahad Shamshad, Moazzam Shoukat, Muhammad Usama, Junaid Qadir

A Survey on Efficient Training of Transformers

Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by…

Machine Learning · Computer Science 2023-05-05 Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen

Transformers in Reinforcement Learning: A Survey

Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in…

Machine Learning · Computer Science 2023-07-13 Pranav Agarwal, Aamer Abdul Rahman, Pierre-Luc St-Charles, Simon J. D. Prince, Samira Ebrahimi Kahou

Transformers in Time Series: A Survey

Transformers have achieved superior performances in many tasks in natural language processing and computer vision, which also triggered great interest in the time series community. Among multiple advantages of Transformers, the ability to…

Machine Learning · Computer Science 2023-05-15 Qingsong Wen, Tian Zhou, Chaoli Zhang, Weiqi Chen, Ziqing Ma, Junchi Yan, Liang Sun

Introduction to Transformers: an NLP Perspective

Transformers have dominated empirical machine learning models of natural language processing. In this paper, we introduce basic concepts of Transformers and present key techniques that form the recent advances of these models. This includes…

Computation and Language · Computer Science 2023-11-30 Tong Xiao, Jingbo Zhu

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Large Language Models (LLMs) have delivered impressive results in language understanding, generation, reasoning, and pushes the ability boundary of multimodal models. Transformer models, as the foundation of modern LLMs, offer a strong…

Computation and Language · Computer Science 2025-08-14 Weigao Sun, Jiaxi Hu, Yucheng Zhou, Jusen Du, Disen Lan, Kexin Wang, Tong Zhu, Xiaoye Qu, Yu Zhang, Xiaoyu Mo, Daizong Liu, Yuxuan Liang, Wenliang Chen, Guoqi Li, Yu Cheng

A Survey on Transformers in Reinforcement Learning

Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings. Recently, a similar surge of using Transformers has appeared in the domain of reinforcement learning (RL), but it is faced…

Machine Learning · Computer Science 2023-09-22 Wenzhe Li, Hao Luo, Zichuan Lin, Chongjie Zhang, Zongqing Lu, Deheng Ye

On the Universality of Transformer Architectures; How Much Attention Is Enough?

Transformers are crucial across many AI fields, such as large language models, computer vision, and reinforcement learning. This prominence stems from the architecture's perceived universality and scalability compared to alternatives. This…

Machine Learning · Computer Science 2025-12-23 Amirreza Abbasi, Mohsen Hooshmand

Vision Transformers: State of the Art and Research Challenges

Transformers have achieved great success in natural language processing. Due to the powerful capability of self-attention mechanism in transformers, researchers develop the vision transformers for a variety of computer vision tasks, such as…

Computer Vision and Pattern Recognition · Computer Science 2022-07-08 Bo-Kai Ruan, Hong-Han Shuai, Wen-Huang Cheng

Advances in Transformers for Robotic Applications: A Review

The introduction of Transformers architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional…

Robotics · Computer Science 2024-12-17 Nikunj Sanghai, Nik Bear Brown

Transformers in Time-series Analysis: A Tutorial

Transformer architecture has widespread applications, particularly in Natural Language Processing and computer vision. Recently Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview…

Machine Learning · Computer Science 2023-07-27 Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ghulam Rasool, Ravi P. Ramachandran

Multimodal Learning with Transformers: A Survey

Transformer is a promising neural network learner, and has achieved great success in various machine learning tasks. Thanks to the recent prevalence of multimodal applications and big data, Transformer-based multimodal learning has become a…

Computer Vision and Pattern Recognition · Computer Science 2023-05-11 Peng Xu, Xiatian Zhu, David A. Clifton

Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning

The Transformer is a highly successful deep learning model that has revolutionised the world of artificial neural networks, first in natural language processing and later in computer vision. This model is based on the attention mechanism…

Machine Learning · Computer Science 2023-05-09 Riccardo Ughi, Eugenio Lomurno, Matteo Matteucci