Related papers: Efficiency 360: Efficient Vision Transformers

200 papers

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-20 · Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah

Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tackle computer vision applications. Their main feature is the capacity to extract global information through the self-attention mechanism,…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-06 · Lorenzo Papa, Paolo Russo, Irene Amerini, Luping Zhou

Self-attention in Transformers comes with a high computational cost because of their quadratic computational complexity, but their effectiveness in addressing problems in language and vision has sparked extensive research aimed at enhancing…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-25 · Tobias Christian Nauen, Sebastian Palacio, Federico Raue, Andreas Dengel

After their initial success in natural language processing, transformer architectures have rapidly gained traction in computer vision, providing state-of-the-art results for tasks such as image classification, detection, segmentation, and…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-21 · Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou

Transformers have achieved great success in natural language processing. Due to the powerful capability of self-attention mechanism in transformers, researchers develop the vision transformers for a variety of computer vision tasks, such as…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-08 · Bo-Kai Ruan, Hong-Han Shuai, Wen-Huang Cheng

Image Classification is a fundamental task in the field of computer vision that frequently serves as a benchmark for gauging advancements in Computer Vision. Over the past few years, significant progress has been made in image…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-06 · Mahmoud Khalil, Ahmad Khalil, Alioune Ngom

Transformers provide promising accuracy and have become popular and used in various domains such as natural language processing and computer vision. However, due to their massive number of model parameters, memory and computation…

Machine Learning · Computer Science · 2021-07-01 · Hamid Tabani, Ajay Balasubramaniam, Shabbir Marzban, Elahe Arani, Bahram Zonooz

The Transformer architecture has achieved significant success in natural language processing, motivating its adaptation to computer vision tasks. Unlike convolutional neural networks, vision transformers inherently capture long-range…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-29 · Zherui Zhang, Rongtao Xu, Jie Zhou, Changwei Wang, Xingtian Pei, Wenhao Xu, Jiguang Zhang, Li Guo, Longxiang Gao, Wenbo Xu, Shibiao Xu

Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-11 · Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, Dacheng Tao

Transformers have had a significant impact on natural language processing and have recently demonstrated their potential in computer vision. They have shown promising results over convolution neural networks in fundamental computer vision…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-14 · Rojina Kashefi, Leili Barekatain, Mohammad Sabokrou, Fatemeh Aghaeipoor

We present Reversible Vision Transformers, a memory efficient architecture design for visual recognition. By decoupling the GPU memory requirement from the depth of the model, Reversible Vision Transformers enable scaling up architectures…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-10 · Karttikeya Mangalam, Haoqi Fan, Yanghao Li, Chao-Yuan Wu, Bo Xiong, Christoph Feichtenhofer, Jitendra Malik

Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing for example,…

Machine Learning · Computer Science · 2022-03-15 · Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional…

Recently, transformers have become incredibly popular in computer vision and vision-language tasks. This notable rise in their usage can be primarily attributed to the capabilities offered by attention mechanisms and the outstanding ability…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-08 · Mayank Vatsa, Anubhooti Jain, Richa Singh

The smoothness of the transformer architecture has been extensively studied in the context of generalization, training stability, and adversarial robustness. However, its role in transfer learning remains poorly understood. In this paper,…

Machine Learning · Computer Science · 2026-02-10 · Ambroise Odonnat, Laetitia Chapel, Romain Tavenard, Ievgen Redko

Vision transformer has achieved competitive performance on a variety of computer vision applications. However, their storage, run-time memory, and computational demands are hindering the deployment to mobile devices. Here we present a…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-17 · Mingjian Zhu, Yehui Tang, Kai Han

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field. The transformer has been used as a replacement for the widely used convolution operators, due to its…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-09 · Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang

Vision Transformers achieve impressive accuracy across a range of visual recognition tasks. Unfortunately, their accuracy frequently comes with high computational costs. This is a particular issue in video recognition, where models are…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-28 · Matthew Dutson, Yin Li, Mohit Gupta

State-of-the-art deep learning models for computer vision tasks are based on the transformer architecture and often deployed in real-time applications. In this scenario, the resources available for every inference can vary, so it is useful…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-17 · Kavya Sreedhar, Jason Clemons, Rangharajan Venkatesan, Stephen W. Keckler, Mark Horowitz

Transformers were initially introduced for natural language processing (NLP) tasks, but they were quickly adopted by most deep learning fields, including computer vision. They measure the relationships between pairs of input tokens (words in…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-22 · Robin Courant, Maika Edberg, Nicolas Dufour, Vicky Kalogeiton