Related papers: Optimizing Vision Transformers with Data-Free Know…

200 papers

In Natural Language Processing (NLP), Transformers have already revolutionized the field by utilizing an attention-based encoder-decoder model. Recently, some pioneering works have employed Transformer-like architectures in Computer Vision…

Computer Vision and Pattern Recognition · Computer Science 2024-02-13 Gousia Habib, Tausifa Jan Saleem, Brejesh Lall

Vision Transformer (ViT) has emerged as a prominent architecture for various computer vision tasks. In ViT, we divide the input image into patch tokens and process them through a stack of self-attention blocks. However, unlike Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2024-04-04 Harsh Rangwani, Pradipto Mondal, Mayank Mishra, Ashish Ramayee Asokan, R. Venkatesh Babu

Vision Transformers (ViTs) are becoming a more popular and dominant technique for various vision tasks compared to Convolutional Neural Networks (CNNs). As a demanding technique in computer vision, ViTs have successfully solved various…

Computer Vision and Pattern Recognition · Computer Science 2023-10-18 Khawar Islam

In this paper, we tackle a new problem: how to transfer knowledge from a cumbersome yet well-performing pre-trained CNN-based model to a compact Vision Transformer (ViT)-based model while maintaining its learning capacity? Due to the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Xu Zheng, Yunhao Luo, Pengyuan Zhou, Lin Wang

In the past few years, transformers have achieved promising performance on various computer vision tasks. Unfortunately, the immense inference overhead of most existing vision transformers prevents them from being deployed on edge…

Computer Vision and Pattern Recognition · Computer Science 2022-06-03 Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang

Vision transformers (ViTs) have gained popularity recently. Even without customized image operators such as convolutions, ViTs can yield competitive performance when properly trained on massive data. However, the computational overhead of…

Machine Learning · Computer Science 2022-03-17 Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang

Vision Transformers (ViTs) have achieved significant advancement in computer vision tasks due to their powerful modeling capacity. However, their performance notably degrades when trained with insufficient data due to lack of inherent…

Image and Video Processing · Electrical Eng. & Systems 2025-03-04 Omar S. EL-Assiouti, Ghada Hamed, Dina Khattab, Hala M. Ebied

Self-supervised learning has been widely applied to train high-quality vision transformers. Unleashing their excellent performance on memory- and compute-constrained devices is therefore an important research topic. However, how to distill…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Kai Wang, Fei Yang, Joost van de Weijer

Assessing the forensic value of hand images involves the use of unique features and patterns present in an individual's hand. The human hand has distinct characteristics, such as the pattern of veins, fingerprints, and the geometry of the…

Computer Vision and Pattern Recognition · Computer Science 2024-08-21 Thanh Thi Nguyen, Campbell Wilson, Janis Dalins

In the recent past, several domain generalization (DG) methods have been proposed, showing encouraging performance; however, almost all of them build on convolutional neural networks (CNNs). There is little to no progress on studying the DG…

Computer Vision and Pattern Recognition · Computer Science 2022-10-06 Maryam Sultana, Muzammal Naseer, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan

Convolutional neural networks (CNNs) have so far been the de-facto model for visual data. Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or even superior performance on image classification tasks. This…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy

Vision transformers (ViTs) achieve remarkable performance on large datasets, but tend to perform worse than convolutional neural networks (CNNs) when trained from scratch on smaller datasets, possibly due to a lack of local inductive bias…

Computer Vision and Pattern Recognition · Computer Science 2023-05-16 Ibrahim Batuhan Akkaya, Senthilkumar S. Kathiresan, Elahe Arani, Bahram Zonooz

This paper presents a study on improving human action recognition through the utilization of knowledge distillation, and the combination of CNN and ViT models. The research aims to enhance the performance and efficiency of smaller student…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Hamid Ahmadabadi, Omid Nejati Manzari, Ahmad Ayatollahi

This paper discusses four facets of the Knowledge Distillation (KD) process for Convolutional Neural Networks (CNNs) and Vision Transformer (ViT) architectures, particularly when executed on edge devices with constrained processing…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 John Violos, Symeon Papadopoulos, Ioannis Kompatsiaris

Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tackle computer vision applications. Their main feature is the capacity to extract global information through the self-attention mechanism,…

Computer Vision and Pattern Recognition · Computer Science 2024-05-06 Lorenzo Papa, Paolo Russo, Irene Amerini, Luping Zhou

Vision transformers (ViTs) have recently been applied successfully to image classification tasks. In this paper, we show that, unlike convolutional neural networks (CNNs) that can be improved by stacking more convolutional layers, the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

While feature-based knowledge distillation has proven highly effective for compressing CNNs, these techniques unexpectedly fail when applied to Vision Transformers (ViTs), often performing worse than simple logit-based distillation. We…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Huiyuan Tian, Bonan Xu, Shijian Li

Learning efficient and expressive visual representation has long been the pursuit of computer vision research. While Vision Transformers (ViTs) gradually replace traditional Convolutional Neural Networks (CNNs) as more scalable vision…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Quan Kong, Yanru Xiao, Yuhao Shen, Cong Wang

Transformer design is the de facto standard for natural language processing tasks. The success of the transformer design in natural language processing has lately piqued the interest of researchers in the domain of computer vision. When…

Computer Vision and Pattern Recognition · Computer Science 2024-02-29 Md Sohag Mia, Abu Bakor Hayat Arnob, Abdu Naim, Abdullah Al Bary Voban, Md Shariful Islam

Computational Pathology (CPATH) systems have the potential to automate diagnostic tasks. However, the artifacts on the digitized histological glass slides, known as Whole Slide Images (WSIs), may hamper the overall performance of CPATH…

Computer Vision and Pattern Recognition · Computer Science 2023-05-30 Neel Kanwal, Trygve Eftestol, Farbod Khoraminia, Tahlita CM Zuiverloon, Kjersti Engan