Related papers: Linear Context Transform Block

Tiled Squeeze-and-Excite: Channel Attention With Local Spatial Context

In this paper we investigate the amount of spatial context required for channel attention. To this end we study the popular squeeze-and-excite (SE) block which is a simple and lightweight channel attention mechanism. SE blocks and its…

Machine Learning · Statistics 2021-07-06 Niv Vosco, Alon Shenkler, Mark Grobman

Contextual Transformer Networks for Visual Recognition

Transformer with self-attention has led to the revolutionizing of natural language processing field, and recently inspires the emergence of Transformer-style architecture design with competitive results in numerous computer vision tasks.…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Yehao Li, Ting Yao, Yingwei Pan, Tao Mei

Squeeze-and-Excitation Networks

The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at…

Computer Vision and Pattern Recognition · Computer Science 2019-05-17 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

Channel Locality Block: A Variant of Squeeze-and-Excitation

Attention mechanism is a hot spot in deep learning field. Using channel attention model is an effective method for improving the performance of the convolutional neural network. Squeeze-and-Excitation block takes advantage of the channel…

Machine Learning · Computer Science 2019-01-08 Huayu Li

Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition

Learning an effective speaker representation is crucial for achieving reliable performance in speaker verification tasks. Speech signals are high-dimensional, long, and variable-length sequences containing diverse information at each…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-25 Wei Xia, John H. L. Hansen

Connectivity-Inspired Network for Context-Aware Recognition

The aim of this paper is threefold. We inform the AI practitioner about the human visual system with an extensive literature review; we propose a novel biologically motivated neural network for image classification; and, finally, we present…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Gianluca Carloni, Sara Colantonio

Long Context Tuning for Video Generation

Recent advances in video generation can produce realistic, minute-long single-shot videos with scalable diffusion transformers. However, real-world narrative videos require multi-shot scenes with visual and dynamic consistency across shots.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Yuwei Guo, Ceyuan Yang, Ziyan Yang, Zhibei Ma, Zhijie Lin, Zhenheng Yang, Dahua Lin, Lu Jiang

STEAM: Squeeze and Transform Enhanced Attention Module

Channel and spatial attention mechanisms introduced by earlier works enhance the representation abilities of deep convolutional neural networks (CNNs) but often lead to increased parameter and computation costs. While recent approaches…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Rishabh Sabharwal, Ram Samarth B B, Parikshit Singh Rathore, Punit Rathore

Gated Channel Transformation for Visual Recognition

In this work, we propose a generally applicable transformation unit for visual recognition with deep convolutional neural networks. This transformation explicitly models channel relationships with explainable control variables. These…

Computer Vision and Pattern Recognition · Computer Science 2020-03-30 Zongxin Yang, Linchao Zhu, Yu Wu, Yi Yang

Core Context Aware Transformers for Long Context Language Modeling

Transformer-based Large Language Models (LLMs) have exhibited remarkable success in extensive tasks primarily attributed to self-attention mechanism, which requires a token to consider all preceding tokens as its context to compute…

Computation and Language · Computer Science 2025-08-05 Yaofo Chen, Zeng You, Shuhai Zhang, Haokun Li, Yirui Li, Yaowei Wang, Mingkui Tan

A Discriminative Channel Diversification Network for Image Classification

Channel attention mechanisms in convolutional neural networks have been proven to be effective in various computer vision tasks. However, the performance improvement comes with additional model complexity and computation cost. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Krushi Patel, Guanghui Wang

Recalibrating Fully Convolutional Networks with Spatial and Channel 'Squeeze & Excitation' Blocks

In a wide range of semantic segmentation tasks, fully convolutional neural networks (F-CNNs) have been successfully leveraged to achieve state-of-the-art performance. Architectural innovations of F-CNNs have mainly been on improving spatial…

Computer Vision and Pattern Recognition · Computer Science 2018-08-27 Abhijit Guha Roy, Nassir Navab, Christian Wachinger

MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models

This paper presents MOAT, a family of neural networks that build on top of MObile convolution (i.e., inverted residual blocks) and ATtention. Unlike the current works that stack separate mobile convolution and transformer blocks, we…

Computer Vision and Pattern Recognition · Computer Science 2023-02-01 Chenglin Yang, Siyuan Qiao, Qihang Yu, Xiaoding Yuan, Yukun Zhu, Alan Yuille, Hartwig Adam, Liang-Chieh Chen

Lightweight Channel Attention for Efficient CNNs

Attention mechanisms have become integral to modern convolutional neural networks (CNNs), delivering notable performance improvements with minimal computational overhead. However, the efficiency accuracy trade off of different channel…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Prem Babu Kanaparthi, Tulasi Venkata Sri Varshini Padamata

How to Squeeze An Explanation Out of Your Model

Deep learning models are widely used nowadays for their reliability in performing various tasks. However, they do not typically provide the reasoning behind their decision, which is a significant drawback, particularly for more sensitive…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Tiago Roxo, Joana C. Costa, Pedro R. M. Inácio, Hugo Proença

Global Context Vision Transformers

We propose global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision. Our method leverages global context self-attention modules, joint with standard local…

Computer Vision and Pattern Recognition · Computer Science 2023-06-07 Ali Hatamizadeh, Hongxu Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

Towards Efficient Scene Understanding via Squeeze Reasoning

Graph-based convolutional model such as non-local block has shown to be effective for strengthening the context modeling ability in convolutional neural networks (CNNs). However, its pixel-wise computational overhead is prohibitive which…

Computer Vision and Pattern Recognition · Computer Science 2021-09-01 Xiangtai Li, Xia Li, Ansheng You, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Zhouchen Lin

LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling

Although transformer architectures have achieved state-of-the-art performance across diverse domains, their quadratic computational complexity with respect to sequence length remains a significant bottleneck, particularly for…

Computation and Language · Computer Science 2025-11-05 Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Bodapati, Gourav Datta, Peter A. Beerel

Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network

This work is an improved system that we submitted to task 1 of DCASE2023 challenge. We propose a method of low-complexity acoustic scene classification by a parallel attention-convolution network which consists of four modules, including…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-13 Yanxiong Li, Jiaxin Tan, Guoqing Chen, Jialong Li, Yongjie Si, Qianhua He

Tied Block Convolution: Leaner and Better CNNs with Shared Thinner Filters

Convolution is the main building block of convolutional neural networks (CNN). We observe that an optimized CNN often has highly correlated filters as the number of channels increases with depth, reducing the expressive power of feature…

Computer Vision and Pattern Recognition · Computer Science 2020-09-28 Xudong Wang, Stella X. Yu