Related papers: Recognizing Manipulation Actions from State-Transf…

Joint Discovery of Object States and Manipulation Actions

Many human activities involve object manipulations aiming to modify the object state. Examples of common state changes include full/empty bottle, open/closed door, and attached/detached car wheel. In this work, we seek to automatically…

Computer Vision and Pattern Recognition · Computer Science 2017-08-29 Jean-Baptiste Alayrac , Josef Sivic , Ivan Laptev , Simon Lacoste-Julien

Understanding hand-object manipulation by modeling the contextual relationship between actions, grasp types and object attributes

This paper proposes a novel method for understanding daily hand-object manipulation by developing computer vision-based techniques. Specifically, we focus on recognizing hand grasp types, object attributes and manipulation actions within an…

Computer Vision and Pattern Recognition · Computer Science 2018-07-24 Minjie Cai , Kris Kitani , Yoichi Sato

Hand-Object Interaction and Precise Localization in Transitive Action Recognition

Action recognition in still images has seen major improvement in recent years due to advances in human pose estimation, object recognition and stronger feature representations produced by deep neural networks. However, there are still many…

Computer Vision and Pattern Recognition · Computer Science 2016-02-25 Amir Rosenfeld , Shimon Ullman

Learning Object Manipulation Skills via Approximate State Estimation from Real Videos

Humans are adept at learning new tasks by watching a few instructional videos. On the other hand, robots that learn new actions either require a lot of effort through trial and error, or use expert demonstrations that are challenging to…

Robotics · Computer Science 2020-11-16 Vladimír Petrík , Makarand Tapaswi , Ivan Laptev , Josef Sivic

Spatial Action Maps for Mobile Manipulation

Typical end-to-end formulations for learning robotic navigation involve predicting a small set of steering command actions (e.g., step forward, turn left, turn right, etc.) from images of the current state (e.g., a bird's-eye view of a SLAM…

Robotics · Computer Science 2020-10-13 Jimmy Wu , Xingyuan Sun , Andy Zeng , Shuran Song , Johnny Lee , Szymon Rusinkiewicz , Thomas Funkhouser

Action Recognition and State Change Prediction in a Recipe Understanding Task Using a Lightweight Neural Network Model

Consider a natural language sentence describing a specific step in a food recipe. In such instructions, recognizing actions (such as press, bake, etc.) and the resulting changes in the state of the ingredients (shape molded, custard cooked,…

Computation and Language · Computer Science 2020-01-24 Qing Wan , Yoonsuck Choe

Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos

Human actions often induce changes of object states such as "cutting an apple", "cleaning shoes" or "pouring coffee". In this paper, we seek to temporally localize object states (e.g. "empty" and "full" cup) together with the corresponding…

Computer Vision and Pattern Recognition · Computer Science 2022-03-23 Tomáš Souček , Jean-Baptiste Alayrac , Antoine Miech , Ivan Laptev , Josef Sivic

Learning Manipulation by Predicting Interaction

Representation learning approaches for robotic manipulation have boomed in recent years. Due to the scarcity of in-domain robot data, prevailing methodologies tend to leverage large-scale human video datasets to extract generalizable…

Robotics · Computer Science 2024-06-04 Jia Zeng , Qingwen Bu , Bangjun Wang , Wenke Xia , Li Chen , Hao Dong , Haoming Song , Dong Wang , Di Hu , Ping Luo , Heming Cui , Bin Zhao , Xuelong Li , Yu Qiao , Hongyang Li

Event-Driven Proactive Assistive Manipulation with Grounded Vision-Language Planning

Assistance in collaborative manipulation is often initiated by user instructions, making high-level reasoning request-driven. In fluent human teamwork, however, partners often infer the next helpful step from the observed outcome of an…

Robotics · Computer Science 2026-03-26 Fengkai Liu , Hao Su , Haozhuang Chi , Rui Geng , Congzhi Ren , Xuqing Liu , Yucheng Xu , Yuichi Ohsita , Liyun Zhang

Explainable Video Action Reasoning via Prior Knowledge and State Transitions

Human action analysis and understanding in videos is an important and challenging task. Although substantial progress has been made in past years, the explainability of existing methods is still limited. In this work, we propose a novel…

Computer Vision and Pattern Recognition · Computer Science 2019-08-29 Tao Zhuo , Zhiyong Cheng , Peng Zhang , Yongkang Wong , Mohan Kankanhalli

Modelling Spatio-Temporal Interactions For Compositional Action Recognition

Humans have the natural ability to recognize actions even if the objects involved in the action or the background are changed. Humans can abstract away the action from the appearance of the objects which is referred to as compositionality…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Ramanathan Rajendiran , Debaditya Roy , Basura Fernando

ARTiS: Appearance-based Action Recognition in Task Space for Real-Time Human-Robot Collaboration

To have a robot actively supporting a human during a collaborative task, it is crucial that robots are able to identify the current action in order to predict the next one. Common approaches make use of high-level knowledge, such as object…

Robotics · Computer Science 2017-03-08 Markus Eich , Sareh Shirazi , Gordon Wyeth

How Do We Move: Modeling Human Movement with System Dynamics

Modeling how human moves in the space is useful for policy-making in transportation, public safety, and public health. Human movements can be viewed as a dynamic process that human transits between states (e.g., locations) over time. In the…

Artificial Intelligence · Computer Science 2021-03-24 Hua Wei , Dongkuan Xu , Junjie Liang , Zhenhui Li

Action Recognition based on Cross-Situational Action-object Statistics

Machine learning models of visual action recognition are typically trained and tested on data from specific situations where actions are associated with certain objects. It is an open question how action-object associations in the training…

Computer Vision and Pattern Recognition · Computer Science 2022-08-16 Satoshi Tsutsui , Xizi Wang , Guangyuan Weng , Yayun Zhang , David Crandall , Chen Yu

VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects

Perceiving and manipulating 3D articulated objects (e.g., cabinets, doors) in human environments is an important yet challenging task for future home-assistant robots. The space of 3D articulated objects is exceptionally rich in their…

Computer Vision and Pattern Recognition · Computer Science 2022-04-04 Ruihai Wu , Yan Zhao , Kaichun Mo , Zizheng Guo , Yian Wang , Tianhao Wu , Qingnan Fan , Xuelin Chen , Leonidas Guibas , Hao Dong

Vision-based Robot Manipulation Learning via Human Demonstrations

Vision-based learning methods provide promise for robots to learn complex manipulation tasks. However, how to generalize the learned manipulation skills to real-world interactions remains an open question. In this work, we study robotic…

Robotics · Computer Science 2020-03-03 Zhixin Jia , Mengxiang Lin , Zhixin Chen , Shibo Jian

Motion Perception in Reinforcement Learning with Dynamic Objects

In dynamic environments, learned controllers are supposed to take motion into account when selecting the action to be taken. However, in existing reinforcement learning works motion is rarely treated explicitly; it is rather assumed that…

Machine Learning · Computer Science 2019-02-04 Artemij Amiranashvili , Alexey Dosovitskiy , Vladlen Koltun , Thomas Brox

Understanding of Object Manipulation Actions Using Human Multi-Modal Sensory Data

Object manipulation actions represent an important share of the Activities of Daily Living (ADLs). In this work, we study how to enable service robots to use human multi-modal data to understand object manipulation actions, and how they can…

Robotics · Computer Science 2019-07-09 Bahareh Abbasi , Ehsan Noohi , Sina Parastegari , Milos Zefran

State Classification with CNN

There is a plenty of research going on in field of object recognition, but object state recognition has not been addressed as much. There are many important applications which can utilize object state recognition, such as, in robotics, to…

Computer Vision and Pattern Recognition · Computer Science 2018-06-27 Astha Sharma

Recognition of Heat-Induced Food State Changes by Time-Series Use of Vision-Language Model for Cooking Robot

Cooking tasks are characterized by large changes in the state of the food, which is one of the major challenges in robot execution of cooking tasks. In particular, cooking using a stove to apply heat to the foodstuff causes many special…

Robotics · Computer Science 2023-09-07 Naoaki Kanazawa , Kento Kawaharazuka , Yoshiki Obinata , Kei Okada , Masayuki Inaba