Related papers: Mobile Video Action Recognition

Efficient Action Detection in Untrimmed Videos via Multi-Task Learning

This paper studies the joint learning of action recognition and temporal localization in long, untrimmed videos. We employ a multi-task learning framework that performs the three highly related steps of action proposal, action recognition,…

Computer Vision and Pattern Recognition · Computer Science 2017-04-05 Yi Zhu , Shawn Newsam

Activity Recognition on a Large Scale in Short Videos - Moments in Time Dataset

Moments capture a huge part of our lives. Accurate recognition of these moments is challenging due to the diverse and complex interpretation of the moments. Action recognition refers to the act of classifying the desired action/activity…

Computer Vision and Pattern Recognition · Computer Science 2018-09-14 Ankit Shah , Harini Kesavamoorthy , Poorva Rane , Pramati Kalwad , Alexander Hauptmann , Florian Metze

Flatten: Video Action Recognition is an Image Classification task

In recent years, video action recognition, as a fundamental task in the field of video understanding, has been deeply explored by numerous researchers. Most traditional video action recognition methods typically involve converting videos…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Junlin Chen , Chengcheng Xu , Yangfan Xu , Jian Yang , Jun Li , Zhiping Shi

Action Recognition Using Temporal Shift Module and Ensemble Learning

This paper presents the first-rank solution for the Multi-Modal Action Recognition Challenge, part of the Multi-Modal Visual Pattern Recognition Workshop at the International Conference on Pattern Recognition (ICPR) 2024. The competition aimed to recognize human actions using a…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 Anh-Kiet Duong , Petra Gomez-Krämer

MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection

Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. The temporal relation is complex in those datasets, including challenges like composite action, and co-occurring action.…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Rui Dai , Srijan Das , Kumara Kahatapitiya , Michael S. Ryoo , Francois Bremond

More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation

Current state-of-the-art models for video action recognition are mostly based on expensive 3D ConvNets. This results in a need for large GPU clusters to train and evaluate such architectures. To address this problem, we present a…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Quanfu Fan , Chun-Fu Chen , Hilde Kuehne , Marco Pistoia , David Cox

Motion-driven Visual Tempo Learning for Video-based Action Recognition

Action visual tempo characterizes the dynamics and the temporal scale of an action, which is helpful to distinguish human actions that share high similarities in visual dynamics and appearance. Previous methods capture the visual tempo…

Computer Vision and Pattern Recognition · Computer Science 2022-07-13 Yuanzhong Liu , Junsong Yuan , Zhigang Tu

M2A: Motion Aware Attention for Accurate Video Action Recognition

Advancements in attention mechanisms have led to significant performance improvements in a variety of areas in machine learning due to its ability to enable the dynamic modeling of temporal sequences. A particular area in computer vision…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Brennan Gebotys , Alexander Wong , David A. Clausi

Rank Pooling for Action Recognition

We propose a function-based temporal pooling method that captures the latent structure of the video sequence data - e.g. how frame-level features evolve over time in a video. We show how the parameters of a function that has been fit to the…

Computer Vision and Pattern Recognition · Computer Science 2016-05-17 Basura Fernando , Efstratios Gavves , Jose Oramas , Amir Ghodrati , Tinne Tuytelaars

A Comprehensive Study of Deep Video Action Recognition

Video action recognition is one of the representative tasks for video understanding. Over the last decade, we have witnessed great advancements in video action recognition thanks to the emergence of deep learning. But we also encountered…

Computer Vision and Pattern Recognition · Computer Science 2020-12-14 Yi Zhu , Xinyu Li , Chunhui Liu , Mohammadreza Zolfaghari , Yuanjun Xiong , Chongruo Wu , Zhi Zhang , Joseph Tighe , R. Manmatha , Mu Li

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles…

Computer Vision and Pattern Recognition · Computer Science 2016-08-03 Limin Wang , Yuanjun Xiong , Zhe Wang , Yu Qiao , Dahua Lin , Xiaoou Tang , Luc Van Gool

Exploring Stronger Feature for Temporal Action Localization

Temporal action localization aims to localize starting and ending time with action category. Limited by GPU memory, mainstream methods pre-extract features for each video. Therefore, feature quality determines the upper bound of detection…

Computer Vision and Pattern Recognition · Computer Science 2021-06-25 Zhiwu Qing , Xiang Wang , Ziyuan Huang , Yutong Feng , Shiwei Zhang , Jianwen Jiang , Mingqian Tang , Changxin Gao , Nong Sang

A Real-time Action Representation with Temporal Encoding and Deep Compression

Deep neural networks have achieved remarkable success for video-based action recognition. However, most of existing approaches cannot be deployed in practice due to the high computational cost. To address this challenge, we propose a new…

Computer Vision and Pattern Recognition · Computer Science 2020-06-18 Kun Liu , Wu Liu , Huadong Ma , Mingkui Tan , Chuang Gan

Temporal Segment Networks for Action Recognition in Videos

Deep convolutional networks have achieved great success for image recognition. However, for action recognition in videos, their advantage over traditional methods is not so evident. We present a general and flexible video-level framework…

Computer Vision and Pattern Recognition · Computer Science 2017-05-09 Limin Wang , Yuanjun Xiong , Zhe Wang , Yu Qiao , Dahua Lin , Xiaoou Tang , Luc Van Gool

An Efficient 3D Convolutional Neural Network with Channel-wise, Spatial-grouped, and Temporal Convolutions

There has been huge progress on video action recognition in recent years. However, many works focus on tweaking existing 2D backbones due to the reliance of ImageNet pretraining, which restrains the models from achieving higher efficiency…

Computer Vision and Pattern Recognition · Computer Science 2025-03-05 Zhe Wang , Xulei Yang

The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024

This report presents our method for Temporal Action Localisation (TAL), which focuses on identifying and classifying actions within specific time intervals throughout a video sequence. We employ a data augmentation technique by expanding…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Yinan Han , Qingyuan Jiang , Hongming Mei , Yang Yang , Jinhui Tang

TDN: Temporal Difference Networks for Efficient Action Recognition

Temporal modeling still remains challenging for action recognition in videos. To mitigate this issue, this paper presents a new video architecture, termed as Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Limin Wang , Zhan Tong , Bin Ji , Gangshan Wu

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Conventionally, spatiotemporal modeling network and its complexity are the two most concentrated research topics in video action recognition. Existing state-of-the-art methods have achieved excellent accuracy regardless of the complexity…

Computer Vision and Pattern Recognition · Computer Science 2021-01-06 Wenhao Wu , Dongliang He , Tianwei Lin , Fu Li , Chuang Gan , Errui Ding

Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization

State-of-the-art temporal action detectors inefficiently search the entire video for specific actions. Despite the encouraging progress these methods achieve, it is crucial to design automated approaches that only explore parts of the video…

Computer Vision and Pattern Recognition · Computer Science 2018-07-30 Humam Alwassel , Fabian Caba Heilbron , Bernard Ghanem

AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

We propose a novel method for temporally pooling frames in a video for the task of human action recognition. The method is motivated by the observation that there are only a small number of frames which, together, contain sufficient…

Computer Vision and Pattern Recognition · Computer Science 2017-06-27 Amlan Kar , Nishant Rai , Karan Sikka , Gaurav Sharma