Related papers: Forecasting Hands and Objects in Future Frames

One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

There is an inherent need for autonomous cars, drones, and other robots to have a notion of how their environment behaves and to anticipate changes in the near future. In this work, we focus on anticipating future appearance given the…

Computer Vision and Pattern Recognition · Computer Science 2017-07-25 Vedran Vukotić, Silvia-Laura Pintea, Christian Raymond, Guillaume Gravier, Jan Van Gemert

Future Object Detection with Spatiotemporal Transformers

We propose the task Future Object Detection, in which the goal is to predict the bounding boxes for all visible objects in a future video frame. While this task involves recognizing temporal and kinematic patterns, in addition to the…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Adam Tonderski, Joakim Johnander, Christoffer Petersson, Kalle Åström

Future Video Synthesis with Object Motion Prediction

We present an approach to predict future video frames given a sequence of continuous video frames in the past. Instead of synthesizing images directly, our approach is designed to understand the complex scene dynamics by decoupling the…

Computer Vision and Pattern Recognition · Computer Science 2020-04-16 Yue Wu, Rongrong Gao, Jaesik Park, Qifeng Chen

VideoPose: Estimating 6D object pose from videos

We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos. Our approach leverages the temporal information from a video sequence, and is computationally efficient and…

Computer Vision and Pattern Recognition · Computer Science 2021-11-23 Apoorva Beedu, Zhile Ren, Varun Agrawal, Irfan Essa

Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

We design a new approach that allows robot learning of new activities from unlabeled human example videos. Given videos of humans executing the same activity from a human's viewpoint (i.e., first-person videos), our objective is to make the…

Robotics · Computer Science 2017-07-25 Jangwon Lee, Michael S. Ryoo

Future Person Localization in First-Person Videos

We present a new task that predicts future locations of people observed in first-person videos. Consider a first-person video stream continuously recorded by a wearable camera. Given a short clip of a person that is extracted from the…

Computer Vision and Pattern Recognition · Computer Science 2018-03-29 Takuma Yagi, Karttikeya Mangalam, Ryo Yonetani, Yoichi Sato

Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames

Recent advances in deep learning have significantly improved performance of video prediction. However, state-of-the-art methods still suffer from blurriness and distortions in their future predictions, especially when there are large…

Computer Vision and Pattern Recognition · Computer Science 2020-03-20 Osamu Shouno

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

We address the challenging task of anticipating human-object interaction in first person videos. Most existing methods ignore how the camera wearer interacts with the objects, or simply consider body motion as a separate modality. In…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Miao Liu, Siyu Tang, Yin Li, James Rehg

Fast and Accurate 3D Hand Pose Estimation via Recurrent Neural Network for Capturing Hand Articulations

3D hand pose estimation from a single depth image plays an important role in computer vision and human-computer interaction. Although recent hand pose estimation methods using convolution neural network (CNN) have shown notable improvements…

Computer Vision and Pattern Recognition · Computer Science 2020-08-28 Cheol-hwan Yoo, Seo-won Ji, Yong-goo Shin, Seung-wook Kim, Sung-jea Ko

Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

We develop a system for modeling hand-object interactions in 3D from RGB images that show a hand which is holding a novel object from a known category. We design a Convolutional Neural Network (CNN) for Hand-held Object Pose and Shape…

Computer Vision and Pattern Recognition · Computer Science 2019-11-12 Mia Kokic, Danica Kragic, Jeannette Bohg

Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations

We present a novel approach for the visual prediction of human-object interactions in videos. Rather than forecasting the human and object motion or the future hand-object contact points, we aim at predicting (a) the class of the on-going…

Computer Vision and Pattern Recognition · Computer Science 2022-09-13 Victoria Manousaki, Konstantinos Papoutsakis, Antonis Argyros

Predicting the Future with Transformational States

An intelligent observer looks at the world and sees not only what is, but what is moving and what can be moved. In other words, the observer sees how the present state of the world can transform in the future. We propose a model that…

Computer Vision and Pattern Recognition · Computer Science 2018-03-28 Andrew Jaegle, Oleh Rybkin, Konstantinos G. Derpanis, Kostas Daniilidis

Detecting Hands and Recognizing Physical Contact in the Wild

We investigate a new problem of detecting hands and recognizing their physical contact state in unconstrained conditions. This is a challenging inference task given the need to reason beyond the local appearance of hands. The lack of…

Computer Vision and Pattern Recognition · Computer Science 2020-10-20 Supreeth Narasimhaswamy, Trung Nguyen, Minh Hoai

Cubic LSTMs for Video Prediction

Predicting future frames in videos has become a promising direction of research for both computer vision and robot learning communities. The core of this problem involves moving object capture and future motion prediction. While object…

Computer Vision and Pattern Recognition · Computer Science 2019-04-23 Hehe Fan, Linchao Zhu, Yi Yang

Object-centric Video Representation for Long-term Action Anticipation

This paper focuses on building object-centric representations for long-term action anticipation in videos. Our key motivation is that objects provide important cues to recognize and predict human-object interactions, especially when the…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Ce Zhang, Changcheng Fu, Shijie Wang, Nakul Agarwal, Kwonjoon Lee, Chiho Choi, Chen Sun

Joint Hand Detection and Rotation Estimation by Using CNN

Hand detection is essential for many hand related tasks, e.g. parsing hand pose, understanding gesture, which are extremely useful for robotics and human-computer interaction. However, hand detection in uncontrolled environments is…

Computer Vision and Pattern Recognition · Computer Science 2016-12-09 Xiaoming Deng, Ye Yuan, Yinda Zhang, Ping Tan, Liang Chang, Shuo Yang, Hongan Wang

A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network

Nowadays, hand gesture recognition has become an alternative for human-machine interaction. It has covered a large area of applications like 3D game technology, sign language interpreting, VR (virtual reality) environment, and robotics. But…

Computer Vision and Pattern Recognition · Computer Science 2022-02-28 Abir Sen, Tapas Kumar Mishra, Ratnakar Dash

Learn to Predict How Humans Manipulate Large-sized Objects from Interactive Motions

Understanding human intentions during interactions has been a long-lasting theme that has applications in human-robot interaction, virtual reality and surveillance. In this study, we focus on full-body human interactions with large-sized…

Computer Vision and Pattern Recognition · Computer Science 2022-06-28 Weilin Wan, Lei Yang, Lingjie Liu, Zhuoying Zhang, Ruixing Jia, Yi-King Choi, Jia Pan, Christian Theobalt, Taku Komura, Wenping Wang

Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering

We propose an approach for forecasting video of complex human activity involving multiple people. Direct pixel-level prediction is too simple to handle the appearance variability in complex activities. Hence, we develop novel intermediate…

Computer Vision and Pattern Recognition · Computer Science 2017-12-07 Mengyao Zhai, Jiacheng Chen, Ruizhi Deng, Lei Chen, Ligeng Zhu, Greg Mori

Research Progress of Convolutional Neural Network and its Application in Object Detection

With the improvement of computer performance and the increase of data volume, the object detection based on convolutional neural network (CNN) has become the main algorithm for object detection. This paper summarizes the research progress…

Computer Vision and Pattern Recognition · Computer Science 2020-07-28 Wei Zhang, Zuoxiang Zeng