Related papers: Deformable 3D Convolution for Video Super-Resoluti…

3DSRnet: Video Super-resolution using 3D Convolutional Neural Networks

In video super-resolution, the spatio-temporal coherence between, and among the frames must be exploited appropriately for accurate prediction of the high resolution frames. Although 2D convolutional neural networks (CNNs) are powerful in…

Computer Vision and Pattern Recognition · Computer Science 2019-06-21 Soo Ye Kim , Jeongyeon Lim , Taeyoung Na , Munchurl Kim

Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Convolutional neural networks have enabled accurate image super-resolution in real-time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In…

Computer Vision and Pattern Recognition · Computer Science 2017-04-11 Jose Caballero , Christian Ledig , Andrew Aitken , Alejandro Acosta , Johannes Totz , Zehan Wang , Wenzhe Shi

Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution

Deep learning-based hyperspectral image super-resolution (SR) methods have achieved great success recently. However, most existing models can not effectively explore spatial information and spectral information between bands simultaneously,…

Computer Vision and Pattern Recognition · Computer Science 2020-01-15 Qi Wang , Qiang Li , Xuelong Li

Spatio-Temporal FAST 3D Convolutions for Human Action Recognition

Effective processing of video input is essential for the recognition of temporally varying events such as human actions. Motivated by the often distinctive temporal characteristics of actions in either horizontal or vertical direction, we…

Computer Vision and Pattern Recognition · Computer Science 2020-06-24 Alexandros Stergiou , Ronald Poppe

Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks

Convolutional Neural Networks (CNN) have been regarded as a powerful class of models for image recognition problems. Nevertheless, it is not trivial when utilizing a CNN for learning spatio-temporal video representation. A few studies have…

Computer Vision and Pattern Recognition · Computer Science 2017-11-29 Zhaofan Qiu , Ting Yao , Tao Mei

D^2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

Despite receiving significant attention from the research community, the task of segmenting and tracking objects in monocular videos still has much room for improvement. Existing works have simultaneously justified the efficacy of dilated…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Christian Schmidt , Ali Athar , Sabarinath Mahadevan , Bastian Leibe

Fast Spatio-Temporal Residual Network for Video Super-Resolution

Recently, deep learning based video super-resolution (SR) methods have achieved promising performance. To simultaneously exploit the spatial and temporal information of videos, employing 3-dimensional (3D) convolutions is a natural…

Computer Vision and Pattern Recognition · Computer Science 2019-04-08 Sheng Li , Fengxiang He , Bo Du , Lefei Zhang , Yonghao Xu , Dacheng Tao

Deformable Kernel Convolutional Network for Video Extreme Super-Resolution

Video super-resolution, which attempts to reconstruct high-resolution video frames from their corresponding low-resolution versions, has received increasingly more attention in recent years. Most existing approaches opt to use deformable…

Computer Vision and Pattern Recognition · Computer Science 2020-10-02 Xuan Xu , Xin Xiong , Jinge Wang , Xin Li

StNet: Local and Global Spatial-Temporal Modeling for Action Recognition

Despite the success of deep learning for static image understanding, it remains unclear what are the most effective network architectures for the spatial-temporal modeling in videos. In this paper, in contrast to the existing CNN+RNN or…

Computer Vision and Pattern Recognition · Computer Science 2018-12-12 Dongliang He , Zhichao Zhou , Chuang Gan , Fu Li , Xiao Liu , Yandong Li , Limin Wang , Shilei Wen

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D…

Computer Vision and Pattern Recognition · Computer Science 2019-06-17 Christopher Choy , JunYoung Gwak , Silvio Savarese

Learning Spatiotemporal Features with 3D Convolutional Networks

We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are…

Computer Vision and Pattern Recognition · Computer Science 2015-10-08 Du Tran , Lubomir Bourdev , Rob Fergus , Lorenzo Torresani , Manohar Paluri

Video Saliency Detection by 3D Convolutional Neural Networks

Different from salient object detection methods for still images, a key challenging for video saliency detection is how to extract and combine spatial and temporal features. In this paper, we present a novel and effective approach for…

Computer Vision and Pattern Recognition · Computer Science 2018-07-13 Guanqun Ding , Yuming Fang

Global Spatial-Temporal Information-based Residual ConvLSTM for Video Space-Time Super-Resolution

By converting low-frame-rate, low-resolution videos into high-frame-rate, high-resolution ones, space-time video super-resolution techniques can enhance visual experiences and facilitate more efficient information dissemination. We propose…

Image and Video Processing · Electrical Eng. & Systems 2024-07-12 Congrui Fu , Hui Yuan , Shiqi Jiang , Guanghui Zhang , Liquan Shen , Raouf Hamzaoui

Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

The work in this paper is driven by the question how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular? Thus far, the vision community has focused on…

Computer Vision and Pattern Recognition · Computer Science 2017-11-23 Ali Diba , Mohsen Fayyaz , Vivek Sharma , Amir Hossein Karami , Mohammad Mahdi Arzani , Rahman Yousefzadeh , Luc Van Gool

Spatiotemporal Dilated Convolution with Uncertain Matching for Video-based Crowd Estimation

In this paper, we propose a novel SpatioTemporal convolutional Dense Network (STDNet) to address the video-based crowd counting problem, which contains the decomposition of 3D convolution and the 3D spatiotemporal dilated dense convolution…

Computer Vision and Pattern Recognition · Computer Science 2021-02-01 Yu-Jen Ma , Hong-Han Shuai , Wen-Huang Cheng

An Efficient 3D Convolutional Neural Network with Channel-wise, Spatial-grouped, and Temporal Convolutions

There has been huge progress on video action recognition in recent years. However, many works focus on tweaking existing 2D backbones due to the reliance of ImageNet pretraining, which restrains the models from achieving higher efficiency…

Computer Vision and Pattern Recognition · Computer Science 2025-03-05 Zhe Wang , Xulei Yang

Deformable Non-local Network for Video Super-Resolution

The video super-resolution (VSR) task aims to restore a high-resolution (HR) video frame by using its corresponding low-resolution (LR) frame and multiple neighboring frames. At present, many deep learning-based VSR methods rely on optical…

Image and Video Processing · Electrical Eng. & Systems 2019-12-24 Hua Wang , Dewei Su , Chuangchuang Liu , Longcun Jin , Xianfang Sun , Xinyi Peng

Using 3D Convolutional Neural Networks to Learn Spatiotemporal Features for Automatic Surgical Gesture Recognition in Video

Automatically recognizing surgical gestures is a crucial step towards a thorough understanding of surgical skill. Possible areas of application include automatic skill assessment, intra-operative monitoring of critical surgical steps, and…

Computer Vision and Pattern Recognition · Computer Science 2019-07-29 Isabel Funke , Sebastian Bodenstedt , Florian Oehme , Felix von Bechtolsheim , Jürgen Weitz , Stefanie Speidel

Collaborative Spatio-temporal Feature Learning for Video Action Recognition

Spatio-temporal feature learning is of central importance for action recognition in videos. Existing deep neural network models either learn spatial and temporal features independently (C2D) or jointly with unconstrained parameters (C3D).…

Computer Vision and Pattern Recognition · Computer Science 2019-03-05 Chao Li , Qiaoyong Zhong , Di Xie , Shiliang Pu

A Closer Look at Spatiotemporal Convolutions for Action Recognition

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have…

Computer Vision and Pattern Recognition · Computer Science 2018-04-13 Du Tran , Heng Wang , Lorenzo Torresani , Jamie Ray , Yann LeCun , Manohar Paluri