Related papers: Visual Semantic Planning using Deep Successor Repr…

200 papers

Prospection, the act of predicting the consequences of many possible futures, is intrinsic to human planning and action, and may even be at the root of consciousness. Surprisingly, this idea has been explored comparatively little in…

Robotics · Computer Science 2018-04-03 Chris Paxton, Yotam Barnoy, Kapil Katyal, Raman Arora, Gregory D. Hager

Visual planning simulates how humans make decisions to achieve desired goals in the form of searching for visual causal transitions between an initial visual state and a final visual goal state. It has become increasingly important in…

Artificial Intelligence · Computer Science 2024-03-28 Yilue Qian, Peiyu Yu, Ying Nian Wu, Yao Su, Wei Wang, Lifeng Fan

Capabilities of inference and prediction are significant components of visual systems. In this paper, we address an important and challenging task of them: visual path prediction. Its goal is to infer the future path for a visual object in…

Computer Vision and Pattern Recognition · Computer Science 2016-12-16 Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang

In this paper, we propose using deep neural architectures (i.e., vision transformers and ResNet) as heuristics for sequential decision-making in robotic manipulation problems. This formulation enables predicting the subset of objects that…

Robotics · Computer Science 2023-08-02 Hongyou Zhou, Ingmar Schubert, Marc Toussaint, Ozgur S. Oguz

What is a good visual representation for autonomous agents? We address this question in the context of semantic visual navigation, which is the problem of a robot finding its way through a complex environment to a target object, e.g. go to…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

In recent years, deep generative models have been shown to 'imagine' convincing high-dimensional observations such as images, audio, and even video, learning directly from raw data. In this work, we ask how to imagine goal-directed visual…

Machine Learning · Computer Science 2018-07-27 Thanard Kurutach, Aviv Tamar, Ge Yang, Stuart Russell, Pieter Abbeel

Planning with world models offers a powerful paradigm for robotic control. Conventional approaches train a model to predict future frames conditioned on current frames and actions, which can then be used for planning. However, the objective…

Machine Learning · Computer Science 2025-10-23 Jacob Berg, Chuning Zhu, Yanda Bao, Ishan Durugkar, Abhishek Gupta

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI. This paper describes progress on this challenge in the context of man-made environments, which are…

Machine Learning · Computer Science 2018-10-01 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet…

Computer Vision and Pattern Recognition · Computer Science 2022-08-11 Brandon Trabucco, Gunnar Sigurdsson, Robinson Piramuthu, Gaurav S. Sukhatme, Ruslan Salakhutdinov

Prospection is an important part of how humans come up with new task plans, but has not been explored in depth in robotics. Predicting multiple task-level is a challenging problem that involves capturing both task semantics and continuous…

Machine Learning · Computer Science 2017-11-13 Chris Paxton, Kapil Katyal, Christian Rupprecht, Raman Arora, Gregory D. Hager

We study the task of embodied visual active learning, where an agent is set to explore a 3D environment with the goal of acquiring visual scene understanding by actively selecting views for which to request annotation. While accurate on some…

Computer Vision and Pattern Recognition · Computer Science 2020-12-18 David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu

In order to autonomously learn wide repertoires of complex skills, robots must be able to learn from their own autonomously collected data, without human supervision. One learning signal that is always available for autonomously collected…

Robotics · Computer Science 2017-10-18 Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine

In this paper, we propose a deep convolutional recurrent neural network that predicts action sequences for task and motion planning (TAMP) from an initial scene image. Typical TAMP problems are formalized by combining reasoning on a…

Machine Learning · Computer Science 2020-06-11 Danny Driess, Jung-Su Ha, Marc Toussaint

Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new target goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge,…

Computer Vision and Pattern Recognition · Computer Science 2016-09-19 Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi

How does the mind organize thoughts? The hippocampal-entorhinal complex is thought to support domain-general representation and processing of structural knowledge of arbitrary state, feature and concept spaces. In particular, it enables the…

Artificial Intelligence · Computer Science 2022-02-24 Paul Stoewer, Christian Schlieker, Achim Schilling, Claus Metzner, Andreas Maier, Patrick Krauss

How do humans navigate to target objects in novel scenes? Do we use the semantic/functional priors we have built over years to efficiently search and navigate? For example, to search for mugs, we search cabinets near the coffee machine and…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi

Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, the application of deep RL to visual navigation with realistic environments is a challenging task. We propose a novel learning…

Robotics · Computer Science 2019-11-12 Jonáš Kulhánek, Erik Derner, Tim de Bruin, Robert Babuška

Learning visual representations from observing actions to benefit robot visuo-motor policy generation is a promising direction that closely resembles human cognitive function and perception. Motivated by this, and further inspired by…

We consider the problem of object goal navigation in unseen environments. Solving this problem requires learning of contextual semantic priors, a challenging endeavour given the spatial and semantic variability of indoor environments.…

Computer Vision and Pattern Recognition · Computer Science 2022-03-10 Georgios Georgakis, Bernadette Bucher, Karl Schmeckpeper, Siddharth Singh, Kostas Daniilidis

When searching for an object humans navigate through a scene using semantic information and spatial relationships. We look for an object using our knowledge of its attributes and relationships with other objects to infer the probable…

Computer Vision and Pattern Recognition · Computer Science 2018-12-18 Jean-Benoit Delbrouck, Stéphane Dupont