Related papers: Linear Context Transform Block

200 papers

In this paper, we investigate the amount of spatial context required for channel attention. To this end, we study the popular squeeze-and-excite (SE) block, which is a simple and lightweight channel attention mechanism. SE blocks and their…

Machine Learning · Statistics 2021-07-06 Niv Vosco, Alon Shenkler, Mark Grobman

Transformers with self-attention have revolutionized the natural language processing field and have recently inspired Transformer-style architecture designs with competitive results in numerous computer vision tasks.…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Yehao Li, Ting Yao, Yingwei Pan, Tao Mei

The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at…

Computer Vision and Pattern Recognition · Computer Science 2019-05-17 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

Attention mechanisms are a hot topic in deep learning. Using a channel attention model is an effective way to improve the performance of convolutional neural networks. The Squeeze-and-Excitation block takes advantage of the channel…

Machine Learning · Computer Science 2019-01-08 Huayu Li

Learning an effective speaker representation is crucial for achieving reliable performance in speaker verification tasks. Speech signals are high-dimensional, long, and variable-length sequences containing diverse information at each…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-25 Wei Xia, John H. L. Hansen

The aim of this paper is threefold. We inform the AI practitioner about the human visual system with an extensive literature review; we propose a novel biologically motivated neural network for image classification; and, finally, we present…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Gianluca Carloni, Sara Colantonio

Recent advances in video generation can produce realistic, minute-long single-shot videos with scalable diffusion transformers. However, real-world narrative videos require multi-shot scenes with visual and dynamic consistency across shots.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Yuwei Guo, Ceyuan Yang, Ziyan Yang, Zhibei Ma, Zhijie Lin, Zhenheng Yang, Dahua Lin, Lu Jiang

Channel and spatial attention mechanisms introduced by earlier works enhance the representation abilities of deep convolutional neural networks (CNNs) but often lead to increased parameter and computation costs. While recent approaches…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Rishabh Sabharwal, Ram Samarth B B, Parikshit Singh Rathore, Punit Rathore

In this work, we propose a generally applicable transformation unit for visual recognition with deep convolutional neural networks. This transformation explicitly models channel relationships with explainable control variables. These…

Computer Vision and Pattern Recognition · Computer Science 2020-03-30 Zongxin Yang, Linchao Zhu, Yu Wu, Yi Yang

Transformer-based Large Language Models (LLMs) have exhibited remarkable success in a wide range of tasks, primarily attributed to the self-attention mechanism, which requires a token to consider all preceding tokens as its context to compute…

Computation and Language · Computer Science 2025-08-05 Yaofo Chen, Zeng You, Shuhai Zhang, Haokun Li, Yirui Li, Yaowei Wang, Mingkui Tan

Channel attention mechanisms in convolutional neural networks have been proven to be effective in various computer vision tasks. However, the performance improvement comes with additional model complexity and computation cost. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Krushi Patel, Guanghui Wang

In a wide range of semantic segmentation tasks, fully convolutional neural networks (F-CNNs) have been successfully leveraged to achieve state-of-the-art performance. Architectural innovations of F-CNNs have mainly been on improving spatial…

Computer Vision and Pattern Recognition · Computer Science 2018-08-27 Abhijit Guha Roy, Nassir Navab, Christian Wachinger

This paper presents MOAT, a family of neural networks that build on top of MObile convolution (i.e., inverted residual blocks) and ATtention. Unlike the current works that stack separate mobile convolution and transformer blocks, we…

Computer Vision and Pattern Recognition · Computer Science 2023-02-01 Chenglin Yang, Siyuan Qiao, Qihang Yu, Xiaoding Yuan, Yukun Zhu, Alan Yuille, Hartwig Adam, Liang-Chieh Chen

Attention mechanisms have become integral to modern convolutional neural networks (CNNs), delivering notable performance improvements with minimal computational overhead. However, the efficiency-accuracy trade-off of different channel…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Prem Babu Kanaparthi, Tulasi Venkata Sri Varshini Padamata

Deep learning models are widely used nowadays for their reliability in performing various tasks. However, they do not typically provide the reasoning behind their decision, which is a significant drawback, particularly for more sensitive…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Tiago Roxo, Joana C. Costa, Pedro R. M. Inácio, Hugo Proença

We propose global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision. Our method leverages global context self-attention modules, joint with standard local…

Computer Vision and Pattern Recognition · Computer Science 2023-06-07 Ali Hatamizadeh, Hongxu Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

Graph-based convolutional models such as the non-local block have been shown to be effective for strengthening the context modeling ability of convolutional neural networks (CNNs). However, their pixel-wise computational overhead is prohibitive, which…

Computer Vision and Pattern Recognition · Computer Science 2021-09-01 Xiangtai Li, Xia Li, Ansheng You, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Zhouchen Lin

Although transformer architectures have achieved state-of-the-art performance across diverse domains, their quadratic computational complexity with respect to sequence length remains a significant bottleneck, particularly for…

Computation and Language · Computer Science 2025-11-05 Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Bodapati, Gourav Datta, Peter A. Beerel

This work presents an improved version of the system that we submitted to Task 1 of the DCASE2023 challenge. We propose a low-complexity acoustic scene classification method based on a parallel attention-convolution network that consists of four modules, including…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-13 Yanxiong Li, Jiaxin Tan, Guoqing Chen, Jialong Li, Yongjie Si, Qianhua He

Convolution is the main building block of convolutional neural networks (CNNs). We observe that an optimized CNN often has highly correlated filters as the number of channels increases with depth, reducing the expressive power of feature…

Computer Vision and Pattern Recognition · Computer Science 2020-09-28 Xudong Wang, Stella X. Yu