Related papers: Sequence-Agnostic Multi-Object Navigation

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

Navigation tasks in photorealistic 3D environments are challenging because they require perception and effective planning under partial observability. Recent work shows that map-like memory is useful for long-horizon navigation tasks.…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Saim Wani , Shivansh Patel , Unnat Jain , Angel X. Chang , Manolis Savva

Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation

The task of Visual Object Navigation (VON) involves an agent's ability to locate a particular object within a given scene. In order to successfully accomplish the VON task, two essential conditions must be fulfilled: 1) the user must know…

Robotics · Computer Science 2023-11-07 Hongcheng Wang , Andy Guan Hong Chen , Xiaoqi Li , Mingdong Wu , Hao Dong

Object Goal Navigation using Data Regularized Q-Learning

Object Goal Navigation requires a robot to find and navigate to an instance of a target object class in a previously unseen environment. Our framework incrementally builds a semantic map of the environment over time, and then repeatedly…

Robotics · Computer Science 2022-08-30 Nandiraju Gireesh , D. A. Sasi Kiran , Snehasis Banerjee , Mohan Sridharan , Brojeshwar Bhowmick , Madhava Krishna

Multi-Object Navigation in real environments using hybrid policies

Navigation has been classically solved in robotics through the combination of SLAM and planning. More recently, beyond waypoint planning, problems involving significant components of (visual) high-level reasoning have been explored in…

Robotics · Computer Science 2024-01-26 Assem Sadek , Guillaume Bono , Boris Chidlovskii , Atilla Baskurt , Christian Wolf

Auxiliary Tasks and Exploration Enable ObjectNav

ObjectGoal Navigation (ObjectNav) is an embodied task wherein agents are to navigate to an object instance in an unseen environment. Prior works have shown that end-to-end ObjectNav agents that use vanilla visual and recurrent modules, e.g.…

Computer Vision and Pattern Recognition · Computer Science 2021-08-04 Joel Ye , Dhruv Batra , Abhishek Das , Erik Wijmans

Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views

Learning object-centric representations of multi-object scenes is a promising approach towards machine intelligence, facilitating high-level reasoning and control from visual sensory data. However, current approaches for unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Li Nanbo , Cian Eastwood , Robert B. Fisher

Vision-based Navigation Using Deep Reinforcement Learning

Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, the application of deep RL to visual navigation with realistic environments is a challenging task. We propose a novel learning…

Robotics · Computer Science 2019-11-12 Jonáš Kulhánek , Erik Derner , Tim de Bruin , Robert Babuška

Object-Oriented Dynamics Learning through Multi-Level Abstraction

Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common…

Machine Learning · Computer Science 2019-12-06 Guangxiang Zhu , Jianhao Wang , Zhizhou Ren , Zichuan Lin , Chongjie Zhang

Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers

Online Multi-Object Tracking (MOT) from videos is a challenging computer vision task which has been extensively studied for decades. Most of the existing MOT algorithms are based on the Tracking-by-Detection (TBD) paradigm combined with…

Computer Vision and Pattern Recognition · Computer Science 2019-04-10 Zhen He , Jian Li , Daxue Liu , Hangen He , David Barber

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised…

Computer Vision and Pattern Recognition · Computer Science 2016-05-30 Ramazan Gokberk Cinbis , Jakob Verbeek , Cordelia Schmid

Active Object Perceiver: Recognition-guided Policy Learning for Object Searching on Mobile Robots

We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs. While scene-driven visual navigation has been widely studied, prior…

Artificial Intelligence · Computer Science 2018-07-31 Xin Ye , Zhe Lin , Haoxiang Li , Shibin Zheng , Yezhou Yang

Multi-Objective Convolutional Neural Networks for Robot Localisation and 3D Position Estimation in 2D Camera Images

The field of collaborative robotics and human-robot interaction often focuses on the prediction of human behaviour, while assuming the information about the robot setup and configuration being known. This is often the case with fixed…

Robotics · Computer Science 2019-02-18 Justinas Miseikis , Inka Brijacak , Saeed Yahyanejad , Kyrre Glette , Ole Jakob Elle , Jim Torresen

MO-DDN: A Coarse-to-Fine Attribute-based Exploration Agent for Multi-object Demand-driven Navigation

The process of satisfying daily demands is a fundamental aspect of humans' daily lives. With the advancement of embodied AI, robots are increasingly capable of satisfying human demands. Demand-driven navigation (DDN) is a task in which an…

Robotics · Computer Science 2024-10-07 Hongcheng Wang , Peiqi Liu , Wenzhe Cai , Mingdong Wu , Zhengyu Qian , Hao Dong

Object-Centric Representation Learning with Generative Spatial-Temporal Factorization

Learning object-centric scene representations is essential for attaining structural understanding and abstraction of complex scenes. Yet, as current approaches for unsupervised object-centric representation learning are built upon either a…

Machine Learning · Computer Science 2021-11-11 Li Nanbo , Muhammad Ahmed Raza , Hu Wenbin , Zhaole Sun , Robert B. Fisher

End-to-end Tracking with a Multi-query Transformer

Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time. Our aim in this paper is to move beyond tracking-by-detection…

Computer Vision and Pattern Recognition · Computer Science 2022-10-27 Bruno Korbar , Andrew Zisserman

Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new target goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge,…

Computer Vision and Pattern Recognition · Computer Science 2016-09-19 Yuke Zhu , Roozbeh Mottaghi , Eric Kolve , Joseph J. Lim , Abhinav Gupta , Li Fei-Fei , Ali Farhadi

Learning to Navigate in Complex Environments

Learning to navigate in complex environments with dynamic elements is an important milestone in developing AI agents. In this work we formulate the navigation question as a reinforcement learning problem and show that data efficiency and…

Artificial Intelligence · Computer Science 2017-01-16 Piotr Mirowski , Razvan Pascanu , Fabio Viola , Hubert Soyer , Andrew J. Ballard , Andrea Banino , Misha Denil , Ross Goroshin , Laurent Sifre , Koray Kavukcuoglu , Dharshan Kumaran , Raia Hadsell

ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings

We present a scalable approach for learning open-world object-goal navigation (ObjectNav) -- the task of asking a virtual robot (agent) to find any instance of an object in an unexplored environment (e.g., "find a sink"). Our approach is…

Computer Vision and Pattern Recognition · Computer Science 2023-10-16 Arjun Majumdar , Gunjan Aggarwal , Bhavika Devnani , Judy Hoffman , Dhruv Batra

Multiple Object Recognition with Visual Attention

We present an attention-based model for recognizing multiple objects in images. The proposed model is a deep recurrent neural network trained with reinforcement learning to attend to the most relevant regions of the input image. We show…

Machine Learning · Computer Science 2015-04-24 Jimmy Ba , Volodymyr Mnih , Koray Kavukcuoglu

Multiple instance active learning for object detection

Despite the substantial progress of active learning for image recognition, there still lacks an instance-level active learning method specified for object detection. In this paper, we propose Multiple Instance Active Object Detection…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Tianning Yuan , Fang Wan , Mengying Fu , Jianzhuang Liu , Songcen Xu , Xiangyang Ji , Qixiang Ye