Related papers: Actor-Centric Relation Network

Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. Recent advances have been achieved by modeling direct pairwise relations between entities. In this paper, we take one…

Computer Vision and Pattern Recognition · Computer Science 2021-04-22 Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li

MRSN: Multi-Relation Support Network for Video Action Detection

Action detection is a challenging video understanding task, requiring modeling spatio-temporal and interaction relations. Current methods usually model actor-actor and actor-context relations separately, ignoring their complementarity and…

Computer Vision and Pattern Recognition · Computer Science 2023-04-25 Yin-Dong Zheng, Guo Chen, Minglei Yuan, Tong Lu

Relational Action Forecasting

This paper focuses on multi-person action forecasting in videos. More precisely, given a history of H previous frames, the goal is to detect actors and to predict their future actions for the next T frames. Our approach jointly models…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid

Learning Actor Relation Graphs for Group Activity Recognition

Modeling relation between actors is important for recognizing group activity in a multi-person scene. This paper aims at learning discriminative relation between actors efficiently using deep models. To this end, we propose to build a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-24 Jianchao Wu, Limin Wang, Li Wang, Jie Guo, Gangshan Wu

Hierarchical Graph-RNNs for Action Detection of Multiple Activities

In this paper, we propose an approach that spatially localizes the activities in a video frame where each person can perform multiple activities at the same time. Our approach takes the temporal scene context as well as the relations of the…

Computer Vision and Pattern Recognition · Computer Science 2021-01-22 Sovan Biswas, Yaser Souri, Juergen Gall

CTRN: Class-Temporal Relational Network for Action Detection

Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. There are many real-world challenges in those datasets, such as composite action, co-occurring action, and high temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-12 Rui Dai, Srijan Das, Francois Bremond

Identity-aware Graph Memory Network for Action Detection

Action detection plays an important role in high-level video understanding and media interpretation. Many existing studies fulfill this spatio-temporal localization by modeling the context, capturing the relationship of actors, objects, and…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Jingcheng Ni, Jie Qin, Di Huang

Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction

Despite the notable progress made in action recognition tasks, not much work has been done in action recognition specifically for human-robot interaction. In this paper, we deeply explore the characteristics of the action recognition task…

Computer Vision and Pattern Recognition · Computer Science 2020-07-03 Ziyang Song, Ziyi Yin, Zejian Yuan, Chong Zhang, Wanchao Chi, Yonggen Ling, Shenghao Zhang

Pose-Based Two-Stream Relational Networks for Action Recognition in Videos

Recently, pose-based action recognition has gained more and more attention due to the better performance compared with traditional appearance-based methods. However, there still exist two problems to be further solved. First, existing…

Computer Vision and Pattern Recognition · Computer Science 2018-05-23 Wei Wang, Jinjin Zhang, Chenyang Si, Liang Wang

Spatio-Temporal Context for Action Detection

Research in action detection has grown in recent years, as it plays a key role in video understanding. Modelling the interactions (either spatial or temporal) between actors and their context has proven to be essential for this task.…

Computer Vision and Pattern Recognition · Computer Science 2021-06-30 Manuel Sarmiento Calderó, David Varas, Elisenda Bou-Balust

Video action detection by learning graph-based spatio-temporal interactions

Action Detection is a complex task that aims to detect and classify human actions in video clips. Typically, it has been addressed by processing fine-grained features extracted from a video classification backbone. Recently, thanks to the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-02 Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara

Temporal Convolutional Networks: A Unified Approach to Action Segmentation

The dominant paradigm for video-based action segmentation is composed of two steps: first, for each frame, compute low-level features using Dense Trajectories or a Convolutional Neural Network that encode spatiotemporal information locally,…

Computer Vision and Pattern Recognition · Computer Science 2016-08-31 Colin Lea, Rene Vidal, Austin Reiter, Gregory D. Hager

Multi-Level Recurrent Residual Networks for Action Recognition

Most existing Convolutional Neural Networks (CNNs) used for action recognition are either difficult to optimize or underuse crucial temporal information. Inspired by the fact that the recurrent model consistently makes breakthroughs in the…

Computer Vision and Pattern Recognition · Computer Science 2018-01-04 Zhenxing Zheng, Gaoyun An, Qiuqi Ruan

Actor-centered Representations for Action Localization in Streaming Videos

Event perception tasks such as recognizing and localizing actions in streaming videos are essential for scaling to real-world application contexts. We tackle the problem of learning actor-centered representations through the notion of…

Computer Vision and Pattern Recognition · Computer Science 2022-12-01 Sathyanarayanan N. Aakur, Sudeep Sarkar

Actor Conditioned Attention Maps for Video Action Detection

While observing complex events with multiple actors, humans do not assess each actor separately, but infer from the context. The surrounding context provides essential information for understanding actions. To this end, we propose to…

Computer Vision and Pattern Recognition · Computer Science 2020-05-12 Oytun Ulutan, Swati Rallapalli, Mudhakar Srivatsa, Carlos Torres, B. S. Manjunath

Video Action Transformer Network

We introduce the Action Transformer model for recognizing and localizing human actions in video clips. We repurpose a Transformer-style architecture to aggregate features from the spatiotemporal context around the person whose actions we…

Computer Vision and Pattern Recognition · Computer Science 2019-05-20 Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman

Learning Higher-order Object Interactions for Keypoint-based Video Understanding

Action recognition is an important problem that requires identifying actions in video by learning complex interactions across scene actors and objects. However, modern deep-learning based networks often require significant computation, and…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Yi Huang, Asim Kadav, Farley Lai, Deep Patel, Hans Peter Graf

Improvement of Human-Object Interaction Action Recognition Using Scene Information and Multi-Task Learning Approach

Recent graph convolutional neural networks (GCNs) have shown high performance in the field of human action recognition by using human skeleton poses. However, they fail to detect human-object interaction cases successfully due to the lack of…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Hesham M. Shehata, Mohammad Abdolrahmani

Efficient Modelling Across Time of Human Actions and Interactions

This thesis focuses on video understanding for human action and interaction recognition. We start by identifying the main challenges related to action recognition from videos and review how they have been addressed by current methods. Based…

Computer Vision and Pattern Recognition · Computer Science 2021-10-06 Alexandros Stergiou

Attentive Action and Context Factorization

We propose a method for human action recognition, one that can localize the spatiotemporal regions that 'define' the actions. This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual…

Computer Vision and Pattern Recognition · Computer Science 2019-04-12 Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai