
Related papers: Efficient Spatialtemporal Context Modeling for Act…

200 papers

Action recognition and anticipation are key to the success of many computer vision applications. Existing methods can roughly be grouped into those that extract global, context-aware representations of the entire image or sequence, and…

Computer Vision and Pattern Recognition · Computer Science 2016-11-21 Mohammad Sadegh Aliakbarian, Fatemehsadat Saleh, Basura Fernando, Mathieu Salzmann, Lars Petersson, Lars Andersson

Research in action detection has grown in recent years, as it plays a key role in video understanding. Modelling the interactions (either spatial or temporal) between actors and their context has proven to be essential for this task.…

Computer Vision and Pattern Recognition · Computer Science 2021-06-30 Manuel Sarmiento Calderó, David Varas, Elisenda Bou-Balust

Real-time video analysis remains a challenging problem in computer vision, requiring efficient processing of both spatial and temporal information while maintaining computational efficiency. Existing approaches often struggle to balance…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Shahla John

This thesis focuses on video understanding for human action and interaction recognition. We start by identifying the main challenges related to action recognition from videos and review how they have been addressed by current methods. Based…

Computer Vision and Pattern Recognition · Computer Science 2021-10-06 Alexandros Stergiou

Referring Video Object Segmentation (RVOS) aims to segment objects in videos given linguistic expressions. The key to solving RVOS is to extract long-range temporal context information from the interactions of expressions and videos to depict the…

Computer Vision and Pattern Recognition · Computer Science 2025-10-10 Cilin Yan, Jingyun Wang, Guoliang Kang

Human action recognition is one of the challenging tasks in computer vision. The current action recognition methods use computationally expensive models for learning spatio-temporal dependencies of the action. Models utilizing RGB channels…

Computer Vision and Pattern Recognition · Computer Science 2022-06-07 Labina Shrestha, Shikha Dubey, Farrukh Olimov, Muhammad Aasim Rafique, Moongu Jeon

Spatial and temporal relationships, both short-range and long-range, between objects in videos, are key cues for recognizing actions. It is a challenging problem to model them jointly. In this paper, we first present a new variant of Long…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Zexi Chen, Bharathkumar Ramachandra, Tianfu Wu, Ranga Raju Vatsavai

Self-attention learns pairwise interactions to model long-range dependencies, yielding great improvements for video action recognition. In this paper, we seek a deeper understanding of self-attention for temporal modeling in videos. We…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava

Temporal modeling is crucial for various video learning tasks. Most recent approaches employ either factorized (2D+1D) or joint (3D) spatial-temporal operations to extract temporal contexts from the input frames. While the former is more…

Computer Vision and Pattern Recognition · Computer Science 2023-01-03 Yizhou Zhao, Zhenyang Li, Xun Guo, Yan Lu

Attentive video modeling is essential for action recognition in unconstrained videos due to their rich yet redundant information over space and time. However, introducing attention in a deep neural network for action recognition is…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu, Antoine Toisoul, Victor Escorcia, Tao Xiang

Action recognition is a critical task in video understanding, requiring the comprehensive capture of spatio-temporal cues across various scales. However, existing methods often overlook the multi-granularity nature of actions. To address…

Computer Vision and Pattern Recognition · Computer Science 2025-12-23 Xiaoyang Li, Wenzhu Yang, Kanglin Wang, Tiebiao Wang, Qingsong Fei

Activity detection is a fundamental problem in computer vision. Detecting activities of different temporal scales is particularly challenging. In this paper, we propose the contextual multi-scale region convolutional 3D network (CMS-RC3D)…

Computer Vision and Pattern Recognition · Computer Science 2018-01-30 Yancheng Bai, Huijuan Xu, Kate Saenko, Bernard Ghanem

Recent video recognition models utilize Transformer models for long-range spatio-temporal context modeling. Video transformer designs are based on self-attention that can model global context at a high computational cost. In comparison,…

Computer Vision and Pattern Recognition · Computer Science 2023-10-30 Syed Talal Wasim, Muhammad Uzair Khattak, Muzammal Naseer, Salman Khan, Mubarak Shah, Fahad Shahbaz Khan

We propose a method for human action recognition, one that can localize the spatiotemporal regions that 'define' the actions. This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual…

Computer Vision and Pattern Recognition · Computer Science 2019-04-12 Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai

Human action recognition in 3D skeleton sequences has attracted a lot of research attention. Recently, Long Short-Term Memory (LSTM) networks have shown promising performance in this task due to their strengths in modeling the dependencies…

Computer Vision and Pattern Recognition · Computer Science 2018-02-14 Jun Liu, Gang Wang, Ling-Yu Duan, Kamila Abdiyeva, Alex C. Kot

With the rapid development of digital multimedia, video understanding has become an important field. For action recognition, temporal dimension plays an important role, and this is quite different from image recognition. In order to learn…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Qian Liu, Tao Wang, Jie Liu, Yang Guan, Qi Bu, Longfei Yang

Deep learning has achieved great success in video recognition, yet still struggles to recognize novel actions when faced with only a few examples. To tackle this challenge, few-shot action recognition methods have been proposed to transfer…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Yilun Zhang, Yuqian Fu, Xingjun Ma, Lizhe Qi, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang

Inspired by the observation that humans are able to process videos efficiently by only paying attention where and when it is needed, we propose an interpretable and easy plug-in spatial-temporal attention mechanism for video action…

Computer Vision and Pattern Recognition · Computer Science 2019-06-04 Lili Meng, Bo Zhao, Bo Chang, Gao Huang, Wei Sun, Frederich Tung, Leonid Sigal

In video-based emotion recognition, audio and visual modalities are often expected to have a complementary relationship, which is widely explored using cross-attention. However, they may also exhibit weak complementary relationships,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-29 R. Gnana Praveen, Jahangir Alam

There is significant progress in recognizing traditional human activities from videos focusing on highly distinctive actions involving discriminative body movements, body-object and/or human-human interactions. Driver's activities are…

Computer Vision and Pattern Recognition · Computer Science 2021-01-19 Zachary Wharton, Ardhendu Behera, Yonghuai Liu, Nik Bessis