Related papers: Precise Temporal Action Localization by Evolving T…
Temporal action localization plays an important role in video analysis, which aims to localize and classify actions in untrimmed videos. The previous methods often predict actions on a feature space of a single-temporal scale. However, the…
Temporal action localization is an important step towards video understanding. Most current action localization methods depend on untrimmed videos with full temporal annotations of action instances. However, it is expensive and…
We address temporal action localization in untrimmed long videos. This is important because videos in real applications are usually unconstrained and contain multiple action instances plus video content of background scenes or other…
State-of-the-art temporal action detectors inefficiently search the entire video for specific actions. Despite the encouraging progress these methods achieve, it is crucial to design automated approaches that only explore parts of the video…
Temporal action localization has recently attracted significant interest in the Computer Vision community. However, despite the great progress, it is hard to identify which aspects of the proposed methods contribute most to the increase in…
Online temporal action localization from an untrimmed video stream is a challenging problem in computer vision. It is challenging because of i) in an untrimmed video stream, more than one action instance may appear, including background…
We address the problem of temporal action localization in videos. We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum of frame-wise classification scores.…
Temporal action localization in untrimmed videos is an important but difficult task. Difficulties are encountered in the application of existing methods when modeling temporal structures of videos. In the present study, we developed a novel…
This technical report analyzes a temporal action localization method we used in the HACS competition which is hosted in Activitynet Challenge 2020.The goal of our task is to locate the start time and end time of the action in the untrimmed…
This paper studies the joint learning of action recognition and temporal localization in long, untrimmed videos. We employ a multi-task learning framework that performs the three highly related steps of action proposal, action recognition,…
Temporal action localization is an important and challenging task that aims to locate temporal regions in real-world untrimmed videos where actions occur and recognize their classes. It is widely acknowledged that video context is a…
Detecting activities in untrimmed videos is an important but challenging task. The performance of existing methods remains unsatisfactory, e.g., they often meet difficulties in locating the beginning and end of a long complex action. In…
This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1 (\textbf{temporal action localization/detection}). Temporal action localization requires to not only precisely locate the…
We present a Temporal Context Network (TCN) for precise temporal localization of human activities. Similar to the Faster-RCNN architecture, proposals are placed at equal intervals in a video which span multiple temporal scales. We propose a…
Temporal action localization is a recently-emerging task, aiming to localize video segments from untrimmed videos that contain specific actions. Despite the remarkable recent progress, most two-stage action localization methods still suffer…
We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks. Our algorithm learns from video-level class labels and predicts temporal intervals of human actions with no…
We present a novel framework, Action Progression Network (APN), for temporal action detection (TAD) in videos. The framework locates actions in videos by detecting the action evolution process. To encode the action evolution, we quantify a…
We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework. TAL-Net addresses three key shortcomings of existing approaches: (1) we improve receptive…
Video action detectors are usually trained using datasets with fully-supervised temporal annotations. Building such datasets is an expensive task. To alleviate this problem, recent methods have tried to leverage weak labeling, where videos…
Accurate temporal action proposals play an important role in detecting actions from untrimmed videos. The existing approaches have difficulties in capturing global contextual information and simultaneously localizing actions with different…