Related papers: Visual Attention Network

Large Separable Kernel Attention: Rethinking the Large Kernel Attention Design in CNN

Visual Attention Networks (VAN) with Large Kernel Attention (LKA) modules have been shown to provide remarkable performance, that surpasses Vision Transformers (ViTs), on a range of vision-based tasks. However, the depth-wise convolutional…

Computer Vision and Pattern Recognition · Computer Science 2023-10-23 Kin Wai Lau, Lai-Man Po, Yasar Abbas Ur Rehman

Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation

Medical image segmentation has seen significant improvements with transformer models, which excel in grasping far-reaching contexts and global contextual information. However, the increasing computational demands of these models,…

Computer Vision and Pattern Recognition · Computer Science 2023-09-04 Reza Azad, Leon Niggemeier, Michael Huttemann, Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, Yury Velichko, Ulas Bagci, Dorit Merhof

BEVANet: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation

Real-time semantic segmentation presents the dual challenge of designing efficient architectures that capture large receptive fields for semantic understanding while also refining detailed contours. Vision transformers model long-range…

Computer Vision and Pattern Recognition · Computer Science 2025-08-21 Ping-Mao Huang, I-Tien Chao, Ping-Chia Huang, Jia-Wei Liao, Yung-Yu Chuang

Multi-scale Attention Network for Single Image Super-Resolution

ConvNets can compete with transformers in high-level tasks by exploiting larger receptive fields. To unleash the potential of ConvNet in super-resolution, we propose a multi-scale attention network (MAN), by coupling classical multi-scale…

Image and Video Processing · Electrical Eng. & Systems 2024-04-16 Yan Wang, Yusen Li, Gang Wang, Xiaoguang Liu

Efficient Image Super-Resolution via Symmetric Visual Attention Network

An important development direction in the Single-Image Super-Resolution (SISR) algorithms is to improve the efficiency of the algorithms. Recently, efficient Super-Resolution (SR) research focuses on reducing model complexity and improving…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Chengxu Wu, Qinrui Fan, Shu Hu, Xi Wu, Xin Wang, Jing Hu

Vicinity Vision Transformer

Vision transformers have shown great success on numerous computer vision tasks. However, its central component, softmax attention, prohibits vision transformers from scaling up to high-resolution images, due to both the computational…

Computer Vision and Pattern Recognition · Computer Science 2023-07-21 Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

LKA-ReID: Vehicle Re-Identification with Large Kernel Attention

With the rapid development of intelligent transportation systems and the popularity of smart city infrastructure, Vehicle Re-ID technology has become an important research field. The vehicle Re-ID task faces an important challenge, which is…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Xuezhi Xiang, Zhushan Ma, Lei Zhang, Denis Ombati, Himaloy Himu, Xiantong Zhen

Interpreting and Improving Attention From the Perspective of Large Kernel Convolution

Attention mechanisms have significantly advanced visual models by capturing global context effectively. However, their reliance on large-scale datasets and substantial computational resources poses challenges in data-scarce and…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Chenghao Li, Chaoning Zhang, Boheng Zeng, Yi Lu, Pengbo Shi, Qingzi Chen, Jirui Liu, Lingyun Zhu, Yang Yang, Heng Tao Shen

Vision Transformer with Super Token Sampling

Vision transformer has achieved impressive performance for many vision tasks. However, it may suffer from high redundancy in capturing local features for shallow layers. Local self-attention or early-stage convolutions are thus utilized,…

Computer Vision and Pattern Recognition · Computer Science 2024-01-26 Huaibo Huang, Xiaoqiang Zhou, Jie Cao, Ran He, Tieniu Tan

Large-Kernel Attention for 3D Medical Image Segmentation

Automatic segmentation of multiple organs and tumors from 3D medical images such as magnetic resonance imaging (MRI) and computed tomography (CT) scans using deep learning methods can aid in diagnosing and treating cancer. However, organs…

Image and Video Processing · Electrical Eng. & Systems 2022-07-25 Hao Li, Yang Nan, Javier Del Ser, Guang Yang

Large coordinate kernel attention network for lightweight image super-resolution

The multi-scale receptive field and large kernel attention (LKA) module have been shown to significantly improve performance in the lightweight image super-resolution task. However, existing lightweight super-resolution (SR) methods seldom…

Image and Video Processing · Electrical Eng. & Systems 2024-09-02 Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu, Xiaoxuan Xu

Lightweight Structure-Aware Attention for Visual Understanding

Attention operator has been widely used as a basic brick in visual understanding since it provides some flexibility through its adjustable kernels. However, this operator suffers from inherent limitations: (1) the attention kernel is not…

Computer Vision and Pattern Recognition · Computer Science 2025-07-04 Heeseung Kwon, Francisco M. Castro, Manuel J. Marin-Jimenez, Nicolas Guil, Karteek Alahari

Hydra Attention: Efficient Attention with Many Heads

While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult. A large reason for this is that self-attention scales quadratically with the number of tokens, which in turn,…

Computer Vision and Pattern Recognition · Computer Science 2022-09-16 Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman

KVT: k-NN Attention for Boosting Vision Transformers

Convolutional Neural Networks (CNNs) have dominated computer vision for years, due to its ability in capturing locality and translation invariance. Recently, many vision transformer architectures have been proposed and they show promising…

Computer Vision and Pattern Recognition · Computer Science 2022-07-26 Pichao Wang, Xue Wang, Fan Wang, Ming Lin, Shuning Chang, Hao Li, Rong Jin

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering

Recently, attention-based Visual Question Answering (VQA) has achieved great success by utilizing question to selectively target different visual areas that are related to the answer. Existing visual attention models are generally planar,…

Computer Vision and Pattern Recognition · Computer Science 2022-06-07 Jingkuan Song, Pengpeng Zeng, Lianli Gao, Heng Tao Shen

RegionViT: Regional-to-Local Attention for Vision Transformers

Vision transformer (ViT) has recently shown its strong capability in achieving comparable results to convolutional neural networks (CNNs) on image classification. However, vanilla ViT simply inherits the same architecture from the natural…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Chun-Fu Chen, Rameswar Panda, Quanfu Fan

Triple Attention Mixed Link Network for Single Image Super Resolution

Single image super resolution is of great importance as a low-level computer vision task. Recent approaches with deep convolutional neural networks have achieved impressive performance. However, existing architectures have limitations due…

Computer Vision and Pattern Recognition · Computer Science 2018-10-09 Xi Cheng, Xiang Li, Jian Yang

Lite Vision Transformer with Enhanced Self-Attention

Despite the impressive representation capacity of vision transformer models, current light-weight vision transformer models still suffer from inconsistent and incorrect dense predictions at local regions. We suspect that the power of their…

Computer Vision and Pattern Recognition · Computer Science 2021-12-22 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille

Diversified Visual Attention Networks for Fine-Grained Object Classification

Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation. Recently, visual attention models have been applied to automatically localize the discriminative regions of…

Computer Vision and Pattern Recognition · Computer Science 2018-02-27 Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan

The Linear Attention Resurrection in Vision Transformer

Vision Transformers (ViTs) have recently taken computer vision by storm. However, the softmax attention underlying ViTs comes with a quadratic complexity in time and memory, hindering the application of ViTs to high-resolution images. We…

Computer Vision and Pattern Recognition · Computer Science 2025-02-17 Chuanyang Zheng