Related papers: Predicting Action Tubes

Online Spatiotemporal Action Detection and Prediction via Causal Representations

In this thesis, we focus on video action understanding problems from an online and real-time processing point of view. We start with the conversion of the traditional offline spatiotemporal action detection pipeline into an online…

Computer Vision and Pattern Recognition · Computer Science 2020-09-01 Gurkirt Singh

AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture

Dominant approaches to action detection can only provide sub-optimal solutions to the problem, as they rely on seeking frame-level detections, to later compose them into "action tubes" in a post-processing step. With this paper we radically…

Computer Vision and Pattern Recognition · Computer Science 2017-08-08 Suman Saha, Gurkirt Singh, Fabio Cuzzolin

TVNet: Temporal Voting Network for Action Localization

We propose a Temporal Voting Network (TVNet) for action localization in untrimmed videos. This incorporates a novel Voting Evidence Module to locate temporal boundaries, more accurately, where temporal contextual evidence is accumulated to…

Computer Vision and Pattern Recognition · Computer Science 2022-01-04 Hanyuan Wang, Dima Damen, Majid Mirmehdi, Toby Perrett

TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation

Video action anticipation aims to predict future action categories from observed frames. Current state-of-the-art approaches mainly resort to recurrent neural networks to encode history information into hidden states, and predict future…

Computer Vision and Pattern Recognition · Computer Science 2020-03-10 Wen Wang, Xiaojiang Peng, Yanzhou Su, Yu Qiao, Jian Cheng

Am I Done? Predicting Action Progress in Videos

In this paper we deal with the problem of predicting action progress in videos. We argue that this is an extremely important task since it can be valuable for a wide range of interaction applications. To this end we introduce a novel…

Computer Vision and Pattern Recognition · Computer Science 2020-03-11 Federico Becattini, Tiberio Uricchio, Lorenzo Seidenari, Lamberto Ballan, Alberto Del Bimbo

Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos

Deep learning has been demonstrated to achieve excellent results for image classification and object detection. However, the impact of deep learning on video analysis (e.g. action detection and recognition) has been limited due to…

Computer Vision and Pattern Recognition · Computer Science 2017-08-03 Rui Hou, Chen Chen, Mubarak Shah

Temporally smooth online action detection using cycle-consistent future anticipation

Many video understanding tasks work in the offline setting by assuming that the input video is given from the start to the end. However, many real-world problems require the online setting, making a decision immediately using only the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-19 Young Hwi Kim, Seonghyeon Nam, Seon Joo Kim

TubeR: Tubelet Transformer for Video Action Detection

We propose TubeR: a simple solution for spatio-temporal video action detection. Different from existing methods that depend on either an off-line actor detector or hand-designed actor-positional hypotheses like proposals or anchors, we…

Computer Vision and Pattern Recognition · Computer Science 2022-05-11 Jiaojiao Zhao, Yanyi Zhang, Xinyu Li, Hao Chen, Shuai Bing, Mingze Xu, Chunhui Liu, Kaustav Kundu, Yuanjun Xiong, Davide Modolo, Ivan Marsic, Cees G. M. Snoek, Joseph Tighe

Incremental Tube Construction for Human Action Detection

Current state-of-the-art action detection systems are tailored for offline batch-processing applications. However, for online applications like human-robot interaction, current systems fall short, either because they only detect one action…

Computer Vision and Pattern Recognition · Computer Science 2018-07-25 Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr

Discovering Spatio-Temporal Action Tubes

In this paper, we address the challenging problem of spatial and temporal action detection in videos. We first develop an effective approach to localize frame-level action regions through integrating static and kinematic information by the…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Yuancheng Ye, Xiaodong Yang, Yingli Tian

TrackNet: Simultaneous Object Detection and Tracking and Its Application in Traffic Video Analysis

Object detection and object tracking are usually treated as two separate processes. Significant progress has been made for object detection in 2D images using deep learning networks. The usual tracking-by-detection pipeline for object…

Computer Vision and Pattern Recognition · Computer Science 2019-02-06 Chenge Li, Gregory Dobler, Xin Feng, Yao Wang

Tube-CNN: Modeling temporal evolution of appearance for object detection in video

Object detection in video is crucial for many applications. Compared to images, video provides additional cues which can help to disambiguate the detection problem. Our goal in this paper is to learn discriminative models for the temporal…

Computer Vision and Pattern Recognition · Computer Science 2018-12-07 Tuan-Hung Vu, Anton Osokin, Ivan Laptev

When will you do what? - Anticipating Temporal Occurrences of Activities

Analyzing human actions in videos has gained increased attention recently. While most works focus on classifying and labeling observed video frames or anticipating the very recent future, making long-term predictions over more than just a…

Computer Vision and Pattern Recognition · Computer Science 2018-04-04 Yazan Abu Farha, Alexander Richard, Juergen Gall

Finding Action Tubes

We address the problem of action detection in videos. Driven by the latest progress in object detection from 2D images, we build action models using rich feature hierarchies derived from shape and kinematic cues. We incorporate appearance…

Computer Vision and Pattern Recognition · Computer Science 2014-11-25 Georgia Gkioxari, Jitendra Malik

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

We present a deep-learning framework for real-time multiple spatio-temporal (S/T) action localisation, classification and early prediction. Current state-of-the-art approaches work offline and are too slow to be useful in real-world…

Computer Vision and Pattern Recognition · Computer Science 2017-08-25 Gurkirt Singh, Suman Saha, Michael Sapienza, Philip Torr, Fabio Cuzzolin

Two-Stream AMTnet for Action Detection

In this paper, we propose Two-Stream AMTnet, which leverages recent advances in video-based action representation[1] and incremental action tube generation[2]. Majority of the present action detectors follow a frame-based representation, a…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Suman Saha, Gurkirt Singh, Fabio Cuzzolin

Action Tubelet Detector for Spatio-Temporal Action Localization

Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level that are then linked or tracked across time. In this paper, we leverage the temporal continuity of videos instead of operating…

Computer Vision and Pattern Recognition · Computer Science 2017-08-22 Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

Generic Tubelet Proposals for Action Localization

We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which can generate generic, class-independent, video-level tubelet proposals in videos. The generated tubelet proposals can be…

Computer Vision and Pattern Recognition · Computer Science 2017-06-01 Jiawei He, Mostafa S. Ibrahim, Zhiwei Deng, Greg Mori

Detecting Parts for Action Localization

In this paper, we propose a new framework for action localization that tracks people in videos and extracts full-body human tubes, i.e., spatio-temporal regions localizing actions, even in the case of occlusions or truncations. This is…

Computer Vision and Pattern Recognition · Computer Science 2017-07-25 Nicolas Chesneau, Grégory Rogez, Karteek Alahari, Cordelia Schmid

VideoCapsuleNet: A Simplified Network for Action Detection

The recent advances in Deep Convolutional Neural Networks (DCNNs) have shown extremely good results for video human action classification, however, action detection is still a challenging problem. The current action detection approaches…

Computer Vision and Pattern Recognition · Computer Science 2018-05-22 Kevin Duarte, Yogesh S Rawat, Mubarak Shah