Related papers: Skip-Convolutions for Efficient Video Processing

SkipNet: Learning Dynamic Routing in Convolutional Networks

While deeper convolutional networks are needed to achieve maximum accuracy in visual perception tasks, for many inputs shallower networks are sufficient. We exploit this observation by learning to skip convolutional layers on a per-input…

Computer Vision and Pattern Recognition · Computer Science 2018-07-26 Xin Wang, Fisher Yu, Zi-Yi Dou, Trevor Darrell, Joseph E. Gonzalez

Rethinking Motion Representation: Residual Frames with 3D ConvNets for Better Action Recognition

Recently, 3D convolutional networks yield good performance in action recognition. However, optical flow stream is still needed to ensure better performance, the cost of which is very high. In this paper, we propose a fast but effective way…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Li Tao, Xueting Wang, Toshihiko Yamasaki

ResidualViT for Efficient Temporally Dense Video Encoding

Several video understanding tasks, such as natural language temporal video grounding, temporal activity localization, and audio description generation, require "temporally dense" reasoning over frames sampled at high temporal resolution.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-17 Mattia Soldan, Fabian Caba Heilbron, Bernard Ghanem, Josef Sivic, Bryan Russell

ResQ: Residual Quantization for Video Perception

This paper accelerates video perception, such as semantic segmentation and human pose estimation, by leveraging cross-frame redundancies. Unlike the existing approaches, which avoid redundant computations by warping the past features using…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Davide Abati, Haitam Ben Yahia, Markus Nagel, Amirhossein Habibian

Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Convolutional neural networks have enabled accurate image super-resolution in real-time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In…

Computer Vision and Pattern Recognition · Computer Science 2017-04-11 Jose Caballero, Christian Ledig, Andrew Aitken, Alejandro Acosta, Johannes Totz, Zehan Wang, Wenzhe Shi

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. For temporal…

Computer Vision and Pattern Recognition · Computer Science 2021-07-13 Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira

Global Spatial-Temporal Information-based Residual ConvLSTM for Video Space-Time Super-Resolution

By converting low-frame-rate, low-resolution videos into high-frame-rate, high-resolution ones, space-time video super-resolution techniques can enhance visual experiences and facilitate more efficient information dissemination. We propose…

Image and Video Processing · Electrical Eng. & Systems 2024-07-12 Congrui Fu, Hui Yuan, Shiqi Jiang, Guanghui Zhang, Liquan Shen, Raouf Hamzaoui

LeanResNet: A Low-cost Yet Effective Convolutional Residual Networks

Convolutional Neural Networks (CNNs) filter the input data using spatial convolution operators with compact stencils. Commonly, the convolution operators couple features from all channels, which leads to immense computational cost in the…

Machine Learning · Computer Science 2019-05-17 Jonathan Ephrath, Lars Ruthotto, Eldad Haber, Eran Treister

Reusing Convolutional Activations from Frame to Frame to Speed up Training and Inference

When processing similar frames in succession, we can take advantage of the locality of the convolution operation to reevaluate only portions of the image that changed from the previous frame. By saving the output of a layer of convolutions…

Computer Vision and Pattern Recognition · Computer Science 2019-09-17 Arno Khachatourian

Distortion-Aware Network Pruning and Feature Reuse for Real-time Video Segmentation

Real-time video segmentation is a crucial task for many real-world applications such as autonomous driving and robot control. Since state-of-the-art semantic segmentation models are often too heavy for real-time applications despite their…

Computer Vision and Pattern Recognition · Computer Science 2022-12-16 Hyunsu Rhee, Dongchan Min, Sunil Hwang, Bruno Andreis, Sung Ju Hwang

Motion Representation Using Residual Frames with 3D CNN

Recently, 3D convolutional networks (3D ConvNets) yield good performance in action recognition. However, optical flow stream is still needed to ensure better performance, the cost of which is very high. In this paper, we propose a fast but…

Computer Vision and Pattern Recognition · Computer Science 2020-06-24 Li Tao, Xueting Wang, Toshihiko Yamasaki

LeanConvNets: Low-cost Yet Effective Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have become indispensable for solving machine learning tasks in speech recognition, computer vision, and other areas that involve high-dimensional data. A CNN filters the input feature using a network…

Machine Learning · Computer Science 2020-02-13 Jonathan Ephrath, Moshe Eliasof, Lars Ruthotto, Eldad Haber, Eran Treister

Skip-Attention: Improving Vision Transformers by Paying Less Attention

This work aims to improve the efficiency of vision transformers (ViT). While ViTs use computationally expensive self-attention operations in every layer, we identify that these operations are highly correlated across layers -- a key…

Computer Vision and Pattern Recognition · Computer Science 2023-01-18 Shashanka Venkataramanan, Amir Ghodrati, Yuki M. Asano, Fatih Porikli, Amirhossein Habibian

Lightweight Residual Densely Connected Convolutional Neural Network

Extremely efficient convolutional neural network architectures are one of the most important requirements for limited-resource devices (such as embedded and mobile devices). The computing power and memory size are two important constraints…

Computer Vision and Pattern Recognition · Computer Science 2021-03-09 Fahimeh Fooladgar, Shohreh Kasaei

Design Light-weight 3D Convolutional Networks for Video Recognition: Temporal Residual, Fully Separable Block, and Fast Algorithm

Deep 3-dimensional (3D) Convolutional Network (ConvNet) has shown promising performance on video recognition tasks because of its powerful spatio-temporal information fusion ability. However, the extremely intensive requirements on memory…

Computer Vision and Pattern Recognition · Computer Science 2019-06-03 Haonan Wang, Jun Lin, Zhongfeng Wang

GSVNet: Guided Spatially-Varying Convolution for Fast Semantic Segmentation on Video

This paper addresses fast semantic segmentation on video. Video segmentation often calls for real-time, or even faster than real-time, processing. One common recipe for conserving computation arising from feature extraction is to propagate…

Computer Vision and Pattern Recognition · Computer Science 2021-06-09 Shih-Po Lee, Si-Cun Chen, Wen-Hsiao Peng

Convolutional Gated Recurrent Networks for Video Segmentation

Semantic segmentation has recently witnessed major progress, where fully convolutional neural networks have been shown to perform well. However, most of the previous work focused on improving single image segmentation. To our knowledge, no prior…

Computer Vision and Pattern Recognition · Computer Science 2016-11-23 Mennatullah Siam, Sepehr Valipour, Martin Jagersand, Nilanjan Ray

Recurrent Fully Convolutional Networks for Video Segmentation

Image segmentation is an important step in most visual tasks. While convolutional neural networks have been shown to perform well on single image segmentation, to our knowledge, no study has been done on leveraging recurrent gated…

Computer Vision and Pattern Recognition · Computer Science 2016-11-01 Sepehr Valipour, Mennatullah Siam, Martin Jagersand, Nilanjan Ray

ResNet: Enabling Deep Convolutional Neural Networks through Residual Learning

Convolutional Neural Networks (CNNs) have revolutionized computer vision, but training very deep networks has been challenging due to the vanishing gradient problem. This paper explores Residual Networks (ResNet), introduced by He et al.…

Computer Vision and Pattern Recognition · Computer Science 2025-10-29 Xingyu Liu, Kun Ming Goh

Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections

In this paper, we propose a very deep fully convolutional encoding-decoding framework for image restoration such as denoising and super-resolution. The network is composed of multiple layers of convolution and de-convolution operators,…

Computer Vision and Pattern Recognition · Computer Science 2016-09-02 Xiao-Jiao Mao, Chunhua Shen, Yu-Bin Yang