Related papers: TVConv: Efficient Translation Variant Convolution …

Dynamic Region-Aware Convolution

We propose a new convolution called Dynamic Region-Aware Convolution (DRConv), which can automatically assign multiple filters to corresponding spatial regions where features have similar representation. In this way, DRConv outperforms…

Computer Vision and Pattern Recognition · Computer Science 2021-03-16 Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun

Temporally-Adaptive Models for Efficient Video Understanding

Spatial convolutions are extensively used in numerous deep video models. They fundamentally assume spatio-temporal invariance, i.e., using shared weights for every location in different frames. This work presents Temporally-Adaptive…

Computer Vision and Pattern Recognition · Computer Science 2023-08-14 Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Yingya Zhang, Ziwei Liu, Marcelo H. Ang

TAda! Temporally-Adaptive Convolutions for Video Understanding

Spatial convolutions are widely used in numerous deep video models. They fundamentally assume spatio-temporal invariance, i.e., using shared weights for every location in different frames. This work presents Temporally-Adaptive Convolutions…

Computer Vision and Pattern Recognition · Computer Science 2022-03-18 Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Mingqian Tang, Ziwei Liu, Marcelo H. Ang

LightViT: Towards Light-Weight Convolution-Free Vision Transformers

Vision transformers (ViTs) are usually considered to be less light-weight than convolutional neural networks (CNNs) due to the lack of inductive bias. Recent works thus resort to convolutions as a plug-and-play module and embed them in…

Computer Vision and Pattern Recognition · Computer Science 2022-07-13 Tao Huang, Lang Huang, Shan You, Fei Wang, Chen Qian, Chang Xu

CompConv: A Compact Convolution Module for Efficient Feature Learning

Convolutional Neural Networks (CNNs) have achieved remarkable success in various computer vision tasks but rely on tremendous computational cost. To solve this problem, existing approaches either compress well-trained large-scale models or…

Computer Vision and Pattern Recognition · Computer Science 2021-07-06 Chen Zhang, Yinghao Xu, Yujun Shen

MUXConv: Information Multiplexing in Convolutional Neural Networks

Convolutional neural networks have witnessed remarkable improvements in computational efficiency in recent years. A key driving force has been the idea of trading-off model expressivity and efficiency through a combination of $1\times 1$…

Computer Vision and Pattern Recognition · Computer Science 2020-04-08 Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti

Design Light-weight 3D Convolutional Networks for Video Recognition: Temporal Residual, Fully Separable Block, and Fast Algorithm

Deep 3-dimensional (3D) Convolutional Network (ConvNet) has shown promising performance on video recognition tasks because of its powerful spatio-temporal information fusion ability. However, the extremely intensive requirements on memory…

Computer Vision and Pattern Recognition · Computer Science 2019-06-03 Haonan Wang, Jun Lin, Zhongfeng Wang

LAConv: Local Adaptive Convolution for Image Fusion

The convolution operation is a powerful tool for feature extraction and plays a prominent role in the field of computer vision. However, when targeting the pixel-wise tasks like image fusion, it would not fully perceive the particularity of…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Zi-Rong Jin, Liang-Jian Deng, Tai-Xiang Jiang, Tian-Jing Zhang

A Lightweight Convolution and Vision Transformer integrated model with Multi-scale Self-attention Mechanism

Vision Transformer (ViT) has prevailed in computer vision tasks due to its strong long-range dependency modelling ability. However, its large model size and weak local feature modeling ability hinder its application in real…

Computer Vision and Pattern Recognition · Computer Science 2025-09-12 Yi Zhang, Lingxiao Wei, Bowei Zhang, Ziwei Liu, Kai Yi, Shu Hu

CageViT: Convolutional Activation Guided Efficient Vision Transformer

Recently, Transformers have emerged as the go-to architecture for both vision and language modeling tasks, but their computational efficiency is limited by the length of the input sequence. To address this, several efficient variants of…

Computer Vision and Pattern Recognition · Computer Science 2023-05-18 Hao Zheng, Jinbao Wang, Xiantong Zhen, Hong Chen, Jingkuan Song, Feng Zheng

Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition

This work introduces pyramidal convolution (PyConv), which is capable of processing the input at multiple filter scales. PyConv contains a pyramid of kernels, where each level involves different types of filters with varying size and depth,…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Ionut Cosmin Duta, Li Liu, Fan Zhu, Ling Shao

Efficient Training of Visual Transformers with Small Datasets

Visual Transformers (VTs) are emerging as an architectural paradigm alternative to Convolutional networks (CNNs). Differently from CNNs, VTs can capture global relations between image elements and they potentially have a larger…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Yahui Liu, Enver Sangineto, Wei Bi, Nicu Sebe, Bruno Lepri, Marco De Nadai

SlimConv: Reducing Channel Redundancy in Convolutional Neural Networks by Weights Flipping

The channel redundancy in feature maps of convolutional neural networks (CNNs) results in the large consumption of memories and computational resources. In this work, we design a novel Slim Convolution (SlimConv) module to boost the…

Computer Vision and Pattern Recognition · Computer Science 2021-09-08 Jiaxiong Qiu, Cai Chen, Shuaicheng Liu, Bing Zeng

Towards Language-guided Visual Recognition via Dynamic Convolutions

In this paper, we are committed to establishing a unified and end-to-end multi-modal network via exploring the language-guided visual recognition. To approach this target, we first propose a novel multi-modal convolution module called…

Computer Vision and Pattern Recognition · Computer Science 2023-09-15 Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yongjian Wu, Yue Gao, Rongrong Ji

TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

The hybrid architecture of convolution neural networks (CNN) and Transformer has been the most popular method for medical image segmentation. However, the existing networks based on the hybrid architecture suffer from two problems. First,…

Image and Video Processing · Electrical Eng. & Systems 2023-12-21 Rui Sun, Tao Lei, Weichuan Zhang, Yong Wan, Yong Xia, Asoke K. Nandi

Dynamic Convolution: Attention over Convolution Kernels

Light-weight convolutional neural networks (CNNs) suffer performance degradation as their low computational budgets constrain both the depth (number of convolution layers) and the width (number of channels) of CNNs, resulting in limited…

Computer Vision and Pattern Recognition · Computer Science 2020-04-02 Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Lu Yuan, Zicheng Liu

ConvMAE: Masked Convolution Meets Masked Autoencoders

Vision Transformers (ViT) become widely-adopted architectures for various vision tasks. Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potentials of ViT,…

Computer Vision and Pattern Recognition · Computer Science 2022-05-20 Peng Gao, Teli Ma, Hongsheng Li, Ziyi Lin, Jifeng Dai, Yu Qiao

RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations

Recent advances in vision transformers (ViTs) have demonstrated the advantage of global modeling capabilities, prompting widespread integration of large-kernel convolutions for enlarging the effective receptive field (ERF). However, the…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Mingshu Zhao, Yi Luo, Yong Ouyang

Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution

Real-SR endeavors to produce high-resolution images with rich details while mitigating the impact of multiple degradation factors. Although existing methods have achieved impressive achievements in detail recovery, they still fall short…

Image and Video Processing · Electrical Eng. & Systems 2024-05-14 Long Peng, Yang Cao, Renjing Pei, Wenbo Li, Jiaming Guo, Xueyang Fu, Yang Wang, Zheng-Jun Zha

FMDConv: Fast Multi-Attention Dynamic Convolution via Speed-Accuracy Trade-off

Spatial convolution is fundamental in constructing deep Convolutional Neural Networks (CNNs) for visual recognition. While dynamic convolution enhances model accuracy by adaptively combining static kernels, it incurs significant…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Tianyu Zhang, Fan Wan, Haoran Duan, Kevin W. Tong, Jingjing Deng, Yang Long