
Related papers: RapidNet: Multi-Level Dilated Convolution Based Mo…

200 papers

Recently, lightweight Vision Transformers (ViTs) demonstrate superior performance and lower latency, compared with lightweight Convolutional Neural Networks (CNNs), on resource-constrained mobile devices. Researchers have discovered many…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-15 · Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding

Light-weight convolutional neural networks (CNNs) are the de facto standard for mobile vision tasks. Their spatial inductive biases allow them to learn representations with fewer parameters across different vision tasks. However, these networks are…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-07 · Sachin Mehta, Mohammad Rastegari

Vision-transformers (ViTs) and large-scale convolution-neural-networks (CNNs) have reshaped computer vision through pretrained feature representations that enable strong transfer learning for diverse tasks. However, their efficiency as…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-07 · Alon Kaya, Igal Bilik, Inna Stainvas

Over the past few decades, convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD). Despite their success, the emergence of vision transformers (ViT) in the 2020s has…

Image and Video Processing · Electrical Eng. & Systems · 2024-04-17 · Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang

With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices. Multiple approaches are proposed to accelerate…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-06 · Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

The hybrid deep models of Vision Transformer (ViT) and Convolution Neural Network (CNN) have emerged as a powerful class of backbones for vision tasks. Scaling up the input resolution of such hybrid backbones naturally strengthens model…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-19 · Ting Yao, Yehao Li, Yingwei Pan, Tao Mei

Traditionally, convolutional neural networks (CNN) and vision transformers (ViT) have dominated computer vision. However, recently proposed vision graph neural networks (ViG) provide a new avenue for exploration. Unfortunately, for mobile…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-04 · Mustafa Munir, William Avery, Radu Marculescu

Vision Transformers (ViT) have shown rapid progress in computer vision tasks, achieving promising results on various benchmarks. However, due to the massive number of parameters and model design, e.g., attention mechanism,…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-12 · Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Although convolutional networks (ConvNets) have enjoyed great success in computer vision (CV), they struggle to capture the global information crucial to dense prediction tasks such as object detection and segmentation. In this work, we…

Computer Vision and Pattern Recognition · Computer Science · 2021-05-12 · Haotian Yan, Zhe Li, Weijian Li, Changhu Wang, Ming Wu, Chuang Zhang

In the realm of resource-constrained mobile vision tasks, the pursuit of efficiency and performance consistently drives innovation in lightweight Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). While ViTs excel at…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-23 · Mingshu Zhao, Yi Luo, Yong Ouyang

The transformer model has gained widespread adoption in computer vision tasks in recent times. However, due to the quadratic time and memory complexity of self-attention, which is proportional to the number of input tokens, most existing…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-13 · Wei Tan, Yifeng Geng, Xuansong Xie

Due to the complex attention mechanisms and model design, most existing vision Transformers (ViTs) cannot perform as efficiently as convolutional neural networks (CNNs) in realistic industrial deployment scenarios, e.g., TensorRT and…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-17 · Jiashi Li, Xin Xia, Wei Li, Huixia Li, Xing Wang, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

Self-attention based models such as vision transformers (ViTs) have emerged as a very competitive architecture alternative to convolutional neural networks (CNNs) in computer vision. Despite increasingly stronger variants with ever-higher…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-25 · Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez

Vision Transformers (ViT) have recently emerged as a powerful alternative to convolutional networks (CNNs). Although hybrid models attempt to bridge the gap between these two architectures, the self-attention layers they rely on induce a…

Machine Learning · Computer Science · 2021-06-11 · Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos

With the increasing popularity and the increasing size of vision transformers (ViTs), there has been an increasing interest in making them more efficient and less computationally costly for deployment on edge devices with limited computing…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-04 · Phuoc-Hoan Charles Le, Xinlin Li

Neural Architecture Search (NAS) has shown promising performance in the automatic design of vision transformers (ViT) exceeding 1G FLOPs. However, designing lightweight and low-latency ViT models for diverse mobile devices remains a big…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-22 · Chen Tang, Li Lyna Zhang, Huiqiang Jiang, Jiahang Xu, Ting Cao, Quanlu Zhang, Yuqing Yang, Zhi Wang, Mao Yang

Vision Transformer (ViT) demonstrates that Transformer for natural language processing can be applied to computer vision tasks and result in comparable performance to convolutional neural networks (CNN), which have been studied and adopted…

Computer Vision and Pattern Recognition · Computer Science · 2021-09-03 · Yi-Lun Liao, Sertac Karaman, Vivienne Sze

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference. 3D convolutional neural networks (CNNs) are accurate at video recognition but…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-20 · Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

Convolutional Neural Networks (CNNs) have advanced existing medical systems for automatic disease diagnosis. However, there are still concerns about the reliability of deep medical diagnosis systems against the potential threats of…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-21 · Omid Nejati Manzari, Hamid Ahmadabadi, Hossein Kashiani, Shahriar B. Shokouhi, Ahmad Ayatollahi

Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image. However, there are still gaps in both performance and computational cost between…

Computer Vision and Pattern Recognition · Computer Science · 2022-06-15 · Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu