Related papers: Memory Proxy Maps for Visual Navigation

Feudal Networks for Visual Navigation

Visual navigation follows the intuition that humans can navigate without detailed maps. A common approach is interactive exploration while building a topological graph with images at nodes that can be used for planning. Recent variations…

Computer Vision and Pattern Recognition · Computer Science 2024-12-16 Faith Johnson, Bryan Bo Cao, Ashwin Ashok, Shubham Jain, Kristin Dana

FeudalNav: A Simple Framework for Visual Navigation

Visual navigation for robotics is inspired by the human ability to navigate environments using visual cues and memory, eliminating the need for detailed maps. In unseen, unmapped, or GPS-denied settings, traditional metric map-based methods…

Robotics · Computer Science 2026-04-27 Faith Johnson, Bryan Bo Cao, Shubham Jain, Ashwin Ashok, Kristin Dana

Semi-parametric Topological Memory for Navigation

We introduce a new memory architecture for navigation in previously unseen environments, inspired by landmark-based navigation in animals. The proposed semi-parametric topological memory (SPTM) consists of a (non-parametric) graph with…

Machine Learning · Computer Science 2018-03-05 Nikolay Savinov, Alexey Dosovitskiy, Vladlen Koltun

Predicting Topological Maps for Visual Navigation in Unexplored Environments

We propose a robotic learning system for autonomous exploration and navigation in unexplored environments. We are motivated by the idea that even an unseen environment may be familiar from previous experiences in similar environments. The…

Robotics · Computer Science 2022-11-24 Huangying Zhan, Hamid Rezatofighi, Ian Reid

Improving Vision-and-Language Navigation by Generating Future-View Image Semantics

Vision-and-Language Navigation (VLN) is the task that requires an agent to navigate through the environment based on natural language instructions. At each step, the agent takes the next action by selecting from a set of navigable…

Computer Vision and Pattern Recognition · Computer Science 2023-04-12 Jialu Li, Mohit Bansal

A Behavioral Approach to Visual Navigation with Graph Localization Networks

Inspired by research in psychology, we introduce a behavioral approach for visual navigation using topological maps. Our goal is to enable a robot to navigate from one location to another, relying only on its visual input and the…

Computer Vision and Pattern Recognition · Computer Science 2019-03-04 Kevin Chen, Juan Pablo de Vicente, Gabriel Sepulveda, Fei Xia, Alvaro Soto, Marynel Vazquez, Silvio Savarese

Learning Your Way Without Map or Compass: Panoramic Target Driven Visual Navigation

We present a robot navigation system that uses an imitation learning framework to successfully navigate in complex environments. Our framework takes a pre-built 3D scan of a real environment and trains an agent from pre-generated expert…

Robotics · Computer Science 2020-09-28 David Watkins-Valls, Jingxi Xu, Nicholas Waytowich, Peter Allen

Multi-Object Navigation with dynamically learned neural implicit representations

Understanding and mapping a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric…

Computer Vision and Pattern Recognition · Computer Science 2023-09-28 Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Masked Path Modeling for Vision-and-Language Navigation

Vision-and-language navigation (VLN) agents are trained to navigate in real-world environments by following natural language instructions. A major challenge in VLN is the limited availability of training data, which hinders the models'…

Computer Vision and Pattern Recognition · Computer Science 2023-05-24 Zi-Yi Dou, Feng Gao, Nanyun Peng

MemoNav: Working Memory Model for Visual Navigation

Image-goal navigation is a challenging task that requires an agent to navigate to a goal indicated by an image in unfamiliar environments. Existing methods utilizing diverse scene memories suffer from inefficient exploration since they use…

Computer Vision and Pattern Recognition · Computer Science 2024-03-29 Hongxin Li, Zeyu Wang, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

MemoNav: Selecting Informative Memories for Visual Navigation

Image-goal navigation is a challenging task, as it requires the agent to navigate to a target indicated by an image in a previously unseen scene. Current methods introduce diverse memory mechanisms which save navigation history to solve…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Hongxin Li, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

GridMM: Grid Memory Map for Vision-and-Language Navigation

Vision-and-language navigation (VLN) enables the agent to navigate to a remote location following the natural language instruction in 3D environments. To represent the previously visited environment, most approaches for VLN implement memory…

Computer Vision and Pattern Recognition · Computer Science 2023-08-25 Zihan Wang, Xiangyang Li, Jiahao Yang, Yeqi Liu, Shuqiang Jiang

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-04-26 Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Visual Representations for Semantic Target Driven Navigation

What is a good visual representation for autonomous agents? We address this question in the context of semantic visual navigation, which is the problem of a robot finding its way through a complex environment to a target object, e.g. go to…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge

In visual semantic navigation, the robot navigates to a target object with egocentric visual observations and the class label of the target is given. It is a meaningful task inspiring a surge of relevant research. However, most of the…

Artificial Intelligence · Computer Science 2021-09-21 Xinzhu Liu, Di Guo, Huaping Liu, Fuchun Sun

Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation

Humans navigate unfamiliar environments using episodic simulation and episodic memory, which facilitate a deeper understanding of the complex relationships between environments and objects. Developing an imaginative memory system inspired…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Yiyuan Pan, Yunzhe Xu, Zhe Liu, Hesheng Wang

MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding

Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs),…

Robotics · Computer Science 2025-08-08 Weifan Zhang, Tingguang Li, Yuzhen Liu

MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation

Visual navigation for autonomous agents is a core task in the fields of computer vision and robotics. Learning-based methods, such as deep reinforcement learning, have the potential to outperform the classical solutions developed for this…

Computer Vision and Pattern Recognition · Computer Science 2021-03-23 Zachary Seymour, Kowshik Thopalli, Niluthpol Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

Emergence of Maps in the Memories of Blind Navigation Agents

Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit…

Artificial Intelligence · Computer Science 2023-02-01 Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra

Bayesian Relational Memory for Semantic Visual Navigation

We introduce a new memory architecture, Bayesian Relational Memory (BRM), to improve the generalization ability for semantic visual navigation agents in unseen environments, where an agent is given a semantic target to navigate towards. BRM…

Computer Vision and Pattern Recognition · Computer Science 2019-09-11 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian