Related papers: Relational Action Forecasting

Actor-Centric Relation Network

Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level and model temporal context with 3D ConvNets. Here, we go one step further and model spatio-temporal relations to capture the…

Computer Vision and Pattern Recognition · Computer Science 2018-07-31 Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid

Hierarchical Graph-RNNs for Action Detection of Multiple Activities

In this paper, we propose an approach that spatially localizes the activities in a video frame where each person can perform multiple activities at the same time. Our approach takes the temporal scene context as well as the relations of the…

Computer Vision and Pattern Recognition · Computer Science 2021-01-22 Sovan Biswas, Yaser Souri, Juergen Gall

MRSN: Multi-Relation Support Network for Video Action Detection

Action detection is a challenging video understanding task, requiring modeling spatio-temporal and interaction relations. Current methods usually model actor-actor and actor-context relations separately, ignoring their complementarity and…

Computer Vision and Pattern Recognition · Computer Science 2023-04-25 Yin-Dong Zheng, Guo Chen, Minglei Yuan, Tong Lu

Am I Done? Predicting Action Progress in Videos

In this paper we deal with the problem of predicting action progress in videos. We argue that this is an extremely important task since it can be valuable for a wide range of interaction applications. To this end we introduce a novel…

Computer Vision and Pattern Recognition · Computer Science 2020-03-11 Federico Becattini, Tiberio Uricchio, Lorenzo Seidenari, Lamberto Ballan, Alberto Del Bimbo

Long Short-Term Relation Networks for Video Action Detection

It has been well recognized that modeling human-object or object-object relations would be helpful for detection task. Nevertheless, the problem is not trivial especially when exploring the interactions between human actor, object and scene…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Dong Li, Ting Yao, Zhaofan Qiu, Houqiang Li, Tao Mei

A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation

Action recognition has become a rapidly developing research field within the last decade. But with the increasing demand for large scale data, the need of hand annotated data for the training becomes more and more impractical. One way to…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Hilde Kuehne, Alexander Richard, Juergen Gall

Human Action Recognition and Prediction: A Survey

Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state. Vision-based action recognition and prediction from videos are such…

Computer Vision and Pattern Recognition · Computer Science 2022-02-15 Yu Kong, Yun Fu

From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation

The action anticipation task refers to predicting what action will happen based on observed videos, which requires the model to have a strong ability to summarize the present and then reason about the future. Experience and common sense…

Computer Vision and Pattern Recognition · Computer Science 2024-08-07 Xin Liu, Chao Hao, Zitong Yu, Huanjing Yue, Jingyu Yang

Video action detection by learning graph-based spatio-temporal interactions

Action Detection is a complex task that aims to detect and classify human actions in video clips. Typically, it has been addressed by processing fine-grained features extracted from a video classification backbone. Recently, thanks to the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-02 Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara

Identity-aware Graph Memory Network for Action Detection

Action detection plays an important role in high-level video understanding and media interpretation. Many existing studies fulfill this spatio-temporal localization by modeling the context, capturing the relationship of actors, objects, and…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Jingcheng Ni, Jie Qin, Di Huang

Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. Recent advances have been achieved by modeling direct pairwise relations between entities. In this paper, we take one…

Computer Vision and Pattern Recognition · Computer Science 2021-04-22 Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li

Temporal Recurrent Networks for Online Action Detection

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed. However, important real-time applications including…

Computer Vision and Pattern Recognition · Computer Science 2019-03-26 Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall

Action-Agnostic Human Pose Forecasting

Predicting and forecasting human dynamics is a very interesting but challenging task with several prospective applications in robotics, health-care, etc. Recently, several methods have been developed for human pose forecasting; however,…

Computer Vision and Pattern Recognition · Computer Science 2018-10-24 Hsu-kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, Juan Carlos Niebles

Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering

We propose an approach for forecasting video of complex human activity involving multiple people. Direct pixel-level prediction is too simple to handle the appearance variability in complex activities. Hence, we develop novel intermediate…

Computer Vision and Pattern Recognition · Computer Science 2017-12-07 Mengyao Zhai, Jiacheng Chen, Ruizhi Deng, Lei Chen, Ligeng Zhu, Greg Mori

Video Action Transformer Network

We introduce the Action Transformer model for recognizing and localizing human actions in video clips. We repurpose a Transformer-style architecture to aggregate features from the spatiotemporal context around the person whose actions we…

Computer Vision and Pattern Recognition · Computer Science 2019-05-20 Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman

Predicting Human Interaction via Relative Attention Model

Predicting human interaction is challenging as the on-going activity has to be inferred based on a partially observed video. Essentially, a good algorithm should effectively model the mutual influence between the two interacting subjects.…

Computer Vision and Pattern Recognition · Computer Science 2017-05-29 Yichao Yan, Bingbing Ni, Xiaokang Yang

When will you do what? - Anticipating Temporal Occurrences of Activities

Analyzing human actions in videos has gained increased attention recently. While most works focus on classifying and labeling observed video frames or anticipating the very recent future, making long-term predictions over more than just a…

Computer Vision and Pattern Recognition · Computer Science 2018-04-04 Yazan Abu Farha, Alexander Richard, Juergen Gall

Multi-Level Recurrent Residual Networks for Action Recognition

Most existing Convolutional Neural Networks (CNNs) used for action recognition are either difficult to optimize or underuse crucial temporal information. Inspired by the fact that the recurrent model consistently makes breakthroughs in the…

Computer Vision and Pattern Recognition · Computer Science 2018-01-04 Zhenxing Zheng, Gaoyun An, Qiuqi Ruan

Recurrent Residual Learning for Action Recognition

Action recognition is a fundamental problem in computer vision with a lot of potential applications such as video surveillance, human computer interaction, and robot learning. Given pre-segmented videos, the task is to recognize actions…

Computer Vision and Pattern Recognition · Computer Science 2017-06-28 Ahsan Iqbal, Alexander Richard, Hilde Kuehne, Juergen Gall

Action Recognition using Visual Attention

We propose a soft attention based model for the task of action recognition in videos. We use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units which are deep both spatially and temporally. Our model…

Machine Learning · Computer Science 2016-02-16 Shikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov