Related papers: Blockwise Temporal-Spatial Pathway Network

200 papers

Despite the success of deep learning for static image understanding, it remains unclear what are the most effective network architectures for the spatial-temporal modeling in videos. In this paper, in contrast to the existing CNN+RNN or…

Computer Vision and Pattern Recognition · Computer Science 2018-12-12 Dongliang He, Zhichao Zhou, Chuang Gan, Fu Li, Xiao Liu, Yandong Li, Limin Wang, Shilei Wen

Deep convolutional networks have achieved great success for image recognition. However, for action recognition in videos, their advantage over traditional methods is not so evident. We present a general and flexible video-level framework…

Computer Vision and Pattern Recognition · Computer Science 2017-05-09 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool

Temporal modeling in videos is a fundamental yet challenging problem in computer vision. In this paper, we propose a novel Temporal Bilinear (TB) model to capture the temporal pairwise feature interactions between adjacent frames. Compared…

Computer Vision and Pattern Recognition · Computer Science 2018-11-27 Yanghao Li, Sijie Song, Yuqi Li, Jiaying Liu

Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles…

Computer Vision and Pattern Recognition · Computer Science 2016-08-03 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool

Skeleton-based action recognition has become popular in recent years due to its efficiency and robustness. Most current methods adopt graph convolutional network (GCN) for topology modeling, but GCN-based methods are limited in…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Jinzhao Luo, Lu Zhou, Guibo Zhu, Guojing Ge, Beiying Yang, Jinqiao Wang

Effective processing of video input is essential for the recognition of temporally varying events such as human actions. Motivated by the often distinctive temporal characteristics of actions in either horizontal or vertical direction, we…

Computer Vision and Pattern Recognition · Computer Science 2020-06-24 Alexandros Stergiou, Ronald Poppe

Effective extraction of temporal patterns is crucial for the recognition of temporally varying actions in video. We argue that the fixed-sized spatio-temporal convolution kernels used in convolutional neural networks (CNNs) can be improved…

Computer Vision and Pattern Recognition · Computer Science 2021-04-01 Alexandros Stergiou, Ronald Poppe

With the rapid development of digital multimedia, video understanding has become an important field. For action recognition, temporal dimension plays an important role, and this is quite different from image recognition. In order to learn…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Qian Liu, Tao Wang, Jie Liu, Yang Guan, Qi Bu, Longfei Yang

Convolutional neural networks with spatio-temporal 3D kernels (3D CNNs) have an ability to directly extract spatio-temporal features from videos for action recognition. Although the 3D kernels tend to overfit because of a large number of…

Computer Vision and Pattern Recognition · Computer Science 2017-08-28 Kensho Hara, Hirokatsu Kataoka, Yutaka Satoh

The work in this paper is driven by the question how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular? Thus far, the vision community has focused on…

Computer Vision and Pattern Recognition · Computer Science 2017-11-23 Ali Diba, Mohsen Fayyaz, Vivek Sharma, Amir Hossein Karami, Mohammad Mahdi Arzani, Rahman Yousefzadeh, Luc Van Gool

Skeleton-based action recognition has attracted considerable attention due to its compact representation of the human body's skeletal structure. Many recent methods have achieved remarkable performance using graph convolutional networks…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Jungho Lee, Minhyeok Lee, Suhwan Cho, Sungmin Woo, Sungjun Jang, Sangyoun Lee

The work in this paper is driven by the question if spatio-temporal correlations are enough for 3D convolutional neural networks (CNN)? Most of the traditional 3D networks use local spatio-temporal features. We introduce a new block that…

Computer Vision and Pattern Recognition · Computer Science 2019-02-08 Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc Van Gool

In recent years, 2D Convolutional Networks-based video action recognition has encouragingly gained wide popularity; however, constrained by the lack of long-range non-linear temporal relation modeling and reverse motion information…

Computer Vision and Pattern Recognition · Computer Science 2021-12-20 Yongkang Zhang, Jun Li, Guoming Wu, Han Zhang, Zhiping Shi, Zhaoxun Liu, Zizhang Wu

There is significant progress in recognizing traditional human activities from videos focusing on highly distinctive actions involving discriminative body movements, body-object and/or human-human interactions. Driver's activities are…

Computer Vision and Pattern Recognition · Computer Science 2021-01-19 Zachary Wharton, Ardhendu Behera, Yonghuai Liu, Nik Bessis

The video based CNN works have focused on effective ways to fuse appearance and motion networks, but they typically lack utilizing temporal information over video frames. In this work, we present a novel spatio-temporal fusion network…

Computer Vision and Pattern Recognition · Computer Science 2019-06-18 Sangwoo Cho, Hassan Foroosh

Spatiotemporal and motion features are two complementary and crucial information for video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn spatiotemporal features and another flow stream to learn motion…

Computer Vision and Pattern Recognition · Computer Science 2019-08-19 Boyuan Jiang, Mengmeng Wang, Weihao Gan, Wei Wu, Junjie Yan

3D convolutional neural networks have achieved promising results for video tasks in computer vision, including video saliency prediction that is explored in this paper. However, 3D convolution encodes visual representation merely on fixed…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Ziqiang Wang, Zhi Liu, Gongyang Li, Yang Wang, Tianhong Zhang, Lihua Xu, Jijun Wang

Graph convolutional networks (GCNs) have been widely used and achieved remarkable results in skeleton-based action recognition. We think the key to skeleton-based action recognition is a skeleton hanging in frames, so we focus on how the…

Computer Vision and Pattern Recognition · Computer Science 2023-12-07 Nguyen Huu Bao Long

In this paper, we address the challenges in unsupervised video object segmentation (UVOS) by proposing an efficient algorithm, termed MTNet, which concurrently exploits motion and temporal cues. Unlike previous methods that focus solely on…

Computer Vision and Pattern Recognition · Computer Science 2025-01-15 Yunzhi Zhuge, Hongyu Gu, Lu Zhang, Jinqing Qi, Huchuan Lu

Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF). We argue that these models are not always the best choice for temporal modeling in videos. A large…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Ombretta Strafforello, Xin Liu, Klamer Schutte, Jan van Gemert