Related papers: Hand-Object Interaction Reasoning

Human Hands as Probes for Interactive Object Understanding

Interactive object understanding, or what we can do to objects and how, is a long-standing goal of computer vision. In this paper, we tackle this problem through observation of human hands in in-the-wild egocentric videos. We demonstrate…

Computer Vision and Pattern Recognition · Computer Science 2022-04-11 Mohit Goyal, Sahil Modi, Rishabh Goyal, Saurabh Gupta

Object Level Visual Reasoning in Videos

Human activity recognition is typically addressed by detecting key concepts like global and local motion, features related to object classes present in the scene, as well as features related to the global context. The next open challenges…

Computer Vision and Pattern Recognition · Computer Science 2018-09-21 Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori

Understanding hand-object manipulation by modeling the contextual relationship between actions, grasp types and object attributes

This paper proposes a novel method for understanding daily hand-object manipulation by developing computer vision-based techniques. Specifically, we focus on recognizing hand grasp types, object attributes and manipulation actions within an…

Computer Vision and Pattern Recognition · Computer Science 2018-07-24 Minjie Cai, Kris Kitani, Yoichi Sato

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

We address the challenging task of anticipating human-object interaction in first person videos. Most existing methods ignore how the camera wearer interacts with the objects, or simply consider body motion as a separate modality. In…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Miao Liu, Siyu Tang, Yin Li, James Rehg

Reasoning About Human-Object Interactions Through Dual Attention Networks

Objects are entities we act upon, where the functionality of an object is determined by how we interact with it. In this work we propose a Dual Attention Network model which reasons about human-object interactions. The dual-attentional…

Computer Vision and Pattern Recognition · Computer Science 2019-09-12 Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou

Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos

We propose to forecast future hand-object interactions given an egocentric video. Instead of predicting action labels or pixels, we directly predict the hand motion trajectory and the future contact points on the next active object (i.e.,…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Shaowei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

The interactions between human and objects are important for recognizing object-centric actions. Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to…

Computer Vision and Pattern Recognition · Computer Science 2024-04-19 Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li

Egocentric Hand-object Interaction Detection

In this paper, we propose a method to jointly determine the status of hand-object interaction. This is crucial for egocentric human activity understanding and interaction. From a computer vision perspective, we believe that determining…

Computer Vision and Pattern Recognition · Computer Science 2022-11-17 Yao Lu, Yanan Liu

Modelling Spatio-Temporal Interactions For Compositional Action Recognition

Humans have the natural ability to recognize actions even if the objects involved in the action or the background are changed. Humans can abstract away the action from the appearance of the objects which is referred to as compositionality…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Ramanathan Rajendiran, Debaditya Roy, Basura Fernando

Forecasting Action through Contact Representations from First Person Video

Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action is reliant on anticipation of contact as is demonstrated by pioneering work in…

Computer Vision and Pattern Recognition · Computer Science 2021-02-02 Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermuller, Yiannis Aloimonos

Modeling long-term interactions to enhance action recognition

In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as…

Computer Vision and Pattern Recognition · Computer Science 2021-04-26 Alejandro Cartas, Petia Radeva, Mariella Dimiccoli

H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions

We present a unified framework for understanding 3D hand and object interactions in raw image sequences from egocentric RGB cameras. Given a single RGB image, our model jointly estimates the 3D hand and object poses, models their…

Computer Vision and Pattern Recognition · Computer Science 2019-04-11 Bugra Tekin, Federica Bogo, Marc Pollefeys

H2O: Two Hands Manipulating Objects for First Person Interaction Recognition

We present a comprehensive framework for egocentric interaction recognition using markerless 3D annotations of two hands manipulating objects. To this end, we propose a method to create a unified dataset for egocentric 3D interaction…

Computer Vision and Pattern Recognition · Computer Science 2021-08-25 Taein Kwon, Bugra Tekin, Jan Stuhmer, Federica Bogo, Marc Pollefeys

Egocentric View Hand Action Recognition by Leveraging Hand Surface and Hand Grasp Type

We introduce a multi-stage framework that uses mean curvature on a hand surface and focuses on learning interaction between hand and object by analyzing hand grasp type for hand action recognition in egocentric videos. The proposed method…

Computer Vision and Pattern Recognition · Computer Science 2021-09-09 Sangpil Kim, Jihyun Bae, Hyunggun Chi, Sunghee Hong, Byoung Soo Koh, Karthik Ramani

Egocentric Hand Track and Object-based Human Action Recognition

Egocentric vision is an emerging field of computer vision that is characterized by the acquisition of images and video from the first person perspective. In this paper we address the challenge of egocentric human action recognition by…

Computer Vision and Pattern Recognition · Computer Science 2019-05-03 Georgios Kapidis, Ronald Poppe, Elsbeth van Dam, Lucas P. J. J. Noldus, Remco C. Veltkamp

Predicting Human Interaction via Relative Attention Model

Predicting human interaction is challenging as the on-going activity has to be inferred based on a partially observed video. Essentially, a good algorithm should effectively model the mutual influence between the two interacting subjects.…

Computer Vision and Pattern Recognition · Computer Science 2017-05-29 Yichao Yan, Bingbing Ni, Xiaokang Yang

Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos

Object affordance is an important concept in hand-object interaction, providing information on action possibilities based on human motor capacity and objects' physical property thus benefiting tasks such as action anticipation and robot…

Computer Vision and Pattern Recognition · Computer Science 2023-02-13 Zecheng Yu, Yifei Huang, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu, Yoichi Sato

Opening the Vocabulary of Egocentric Actions

Human actions in egocentric videos are often hand-object interactions composed from a verb (performed by the hand) applied to an object. Despite their extensive scaling up, egocentric datasets still face two limitations - sparsity of action…

Computer Vision and Pattern Recognition · Computer Science 2023-12-13 Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao

Interaction Region Visual Transformer for Egocentric Action Anticipation

Human-object interaction is one of the most important visual cues and we propose a novel way to represent human-object interactions for egocentric action anticipation. We propose a novel transformer variant to model interactions by…

Computer Vision and Pattern Recognition · Computer Science 2024-01-12 Debaditya Roy, Ramanathan Rajendiran, Basura Fernando

Helping Hands: An Object-Aware Ego-Centric Video Recognition Model

We introduce an object-aware decoder for improving the performance of spatio-temporal representations on ego-centric videos. The key idea is to enhance object-awareness during training by tasking the model to predict hand positions, object…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Chuhan Zhang, Ankush Gupta, Andrew Zisserman