Related papers: Long Activity Video Understanding using Functional…

Object Level Visual Reasoning in Videos

Human activity recognition is typically addressed by detecting key concepts like global and local motion, features related to object classes present in the scene, as well as features related to the global context. The next open challenges…

Computer Vision and Pattern Recognition · Computer Science 2018-09-21 Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

The interactions between human and objects are important for recognizing object-centric actions. Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to…

Computer Vision and Pattern Recognition · Computer Science 2024-04-19 Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li

Attend and Interact: Higher-Order Object Interactions for Video Understanding

Human actions often involve complex interactions across several inter-related objects in the scene. However, existing approaches to fine-grained video understanding or visual relationship detection often rely on single object representation…

Computer Vision and Pattern Recognition · Computer Science 2018-03-22 Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf

Segmenting Moving Objects via an Object-Centric Layered Representation

The objective of this paper is a model that is able to discover, track and segment multiple moving objects in a video. We make four contributions: First, we introduce an object-centric segmentation model with a depth-ordered layer…

Computer Vision and Pattern Recognition · Computer Science 2022-11-15 Junyu Xie, Weidi Xie, Andrew Zisserman

Video Action Recognition Using spatio-temporal optical flow video frames

Recognizing human actions based on videos has become one of the most popular areas of research in computer vision in recent years. This area has many applications such as surveillance, robotics, health care, video search and human-computer…

Computer Vision and Pattern Recognition · Computer Science 2021-03-10 Aytekin Nebisoy, Saber Malekzadeh

Unified Graph Structured Models for Video Understanding

Accurate video understanding involves reasoning about the relationships between actors, objects and their environment, often over long temporal intervals. In this paper, we propose a message passing graph neural network that explicitly…

Computer Vision and Pattern Recognition · Computer Science 2021-03-30 Anurag Arnab, Chen Sun, Cordelia Schmid

Action Recognition using Visual Attention

We propose a soft attention based model for the task of action recognition in videos. We use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units which are deep both spatially and temporally. Our model…

Machine Learning · Computer Science 2016-02-16 Shikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov

How can objects help action recognition?

Current state-of-the-art video models process a video clip as a long sequence of spatio-temporal tokens. However, they do not explicitly model objects, their interactions across the video, and instead process all the tokens in the video. In…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Xingyi Zhou, Anurag Arnab, Chen Sun, Cordelia Schmid

Issues in Object Detection in Videos using Common Single-Image CNNs

A growing branch of computer vision is object detection. Object detection is used in many applications such as industrial process, medical imaging analysis, and autonomous vehicles. The ability to detect objects in videos is crucial. Object…

Computer Vision and Pattern Recognition · Computer Science 2021-05-28 Spencer Ploeger, Lucas Dasovic

Deep Learning in Video Multi-Object Tracking: A Survey

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem…

Computer Vision and Pattern Recognition · Computer Science 2019-11-21 Gioele Ciaparrone, Francisco Luque Sánchez, Siham Tabik, Luigi Troiano, Roberto Tagliaferri, Francisco Herrera

Attention is All We Need: Nailing Down Object-centric Attention for Egocentric Activity Recognition

In this paper we propose an end-to-end trainable deep neural network model for egocentric activity recognition. Our model is built on the observation that egocentric activities are highly characterized by the objects and their locations in…

Computer Vision and Pattern Recognition · Computer Science 2018-08-01 Swathikiran Sudhakaran, Oswald Lanz

Learning Object-Action Relations from Bimanual Human Demonstration Using Graph Networks

Recognizing human actions is a vital task for a humanoid robot, especially in domains like programming by demonstration. Previous approaches on action recognition primarily focused on the overall prevalent action being executed, but we…

Robotics · Computer Science 2019-09-13 Christian R. G. Dreher, Mirko Wächter, Tamim Asfour

Object Recognition from Short Videos for Robotic Perception

Deep neural networks have become the primary learning technique for object recognition. Videos, unlike still images, are temporally coherent which makes the application of deep networks non-trivial. Here, we investigate how motion can aid…

Computer Vision and Pattern Recognition · Computer Science 2015-09-08 Ivan Bogun, Anelia Angelova, Navdeep Jaitly

Learning To Recognize Procedural Activities with Distant Supervision

In this paper we consider the problem of classifying fine-grained, multi-step activities (e.g., cooking different recipes, making disparate home improvements, creating various forms of arts and crafts) from long videos spanning up to…

Computer Vision and Pattern Recognition · Computer Science 2022-06-20 Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani

Object-centric Video Representation for Long-term Action Anticipation

This paper focuses on building object-centric representations for long-term action anticipation in videos. Our key motivation is that objects provide important cues to recognize and predict human-object interactions, especially when the…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Ce Zhang, Changcheng Fu, Shijie Wang, Nakul Agarwal, Kwonjoon Lee, Chiho Choi, Chen Sun

Fast Interactive Video Object Segmentation with Graph Neural Networks

Pixelwise annotation of image sequences can be very tedious for humans. Interactive video object segmentation aims to utilize automatic methods to speed up the process and reduce the workload of the annotators. Most contemporary approaches…

Computer Vision and Pattern Recognition · Computer Science 2021-04-22 Viktor Varga, András Lőrincz

Interpretable Action Recognition on Hard to Classify Actions

We investigate a human-like interpretable model of video understanding. Humans recognise complex activities in video by recognising critical spatio-temporal relations among explicitly recognised objects and parts, for example, an object…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Anastasia Anichenko, Frank Guerin, Andrew Gilbert

Dynamic Graph Modules for Modeling Object-Object Interactions in Activity Recognition

Video action recognition, a critical problem in video understanding, has been gaining increasing attention. To identify actions induced by complex object-object interactions, we need to consider not only spatial relations among objects in a…

Computer Vision and Pattern Recognition · Computer Science 2019-05-08 Hao Huang, Luowei Zhou, Wei Zhang, Jason J. Corso, Chenliang Xu

Human activity recognition using deep learning approaches and single frame CNN and convolutional LSTM

Human activity recognition is one of the most important tasks in computer vision and has proved useful in different fields such as healthcare, sports training and security. There are a number of approaches that have been explored to solve…

Computer Vision and Pattern Recognition · Computer Science 2023-05-01 Sheryl Mathew, Annapoorani Subramanian, Pooja, Balamurugan MS, Manoj Kumar Rajagopal

Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks

We present a deep learning method for the interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks…

Computer Vision and Pattern Recognition · Computer Science 2019-05-03 Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim