Related papers: Guided Navigation from Multiple Viewpoints using Q…

Multimodal Perception for Goal-oriented Navigation: A Survey

Goal-oriented navigation presents a fundamental challenge for autonomous systems, requiring agents to navigate complex environments to reach designated targets. This survey offers a comprehensive analysis of multimodal navigation approaches…

Robotics · Computer Science 2025-04-23 I-Tak Ieong, Hao Tang

Deep Learning for Embodied Vision Navigation: A Survey

The "embodied visual navigation" problem requires an agent to navigate in a 3D environment relying mainly on its first-person observations. This problem has attracted rising attention in recent years due to its wide application in autonomous…

Robotics · Computer Science 2021-10-12 Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang, Xiaojun Chang

Towards self-attention based visual navigation in the real world

Vision guided navigation requires processing complex visual information to inform task-orientated decisions. Applications include autonomous robots, self-driving cars, and assistive vision for humans. A key element is the extraction and…

Robotics · Computer Science 2022-09-20 Jaime Ruiz-Serra, Jack White, Stephen Petrie, Tatiana Kameneva, Chris McCarthy

Visual Representations for Semantic Target Driven Navigation

What is a good visual representation for autonomous agents? We address this question in the context of semantic visual navigation, which is the problem of a robot finding its way through a complex environment to a target object, e.g. go to…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

Towards Navigation by Reasoning over Spatial Configurations

We deal with the navigation problem where the agent follows natural language instructions while observing the environment. Focusing on language understanding, we show the importance of spatial semantics in grounding navigation instructions…

Computation and Language · Computer Science 2021-05-17 Yue Zhang, Quan Guo, Parisa Kordjamshidi

Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge

In visual semantic navigation, the robot navigates to a target object with egocentric visual observations and the class label of the target is given. It is a meaningful task inspiring a surge of relevant research. However, most of the…

Artificial Intelligence · Computer Science 2021-09-21 Xinzhu Liu, Di Guo, Huaping Liu, Fuchun Sun

An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments

Visual navigation by mobile robots is classically tackled through SLAM plus optimal planning, and more recently through end-to-end training of policies implemented as deep networks. While the former are often limited to waypoint planning,…

Artificial Intelligence · Computer Science 2021-11-30 Assem Sadek, Guillaume Bono, Boris Chidlovskii, Christian Wolf

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-04-26 Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Learning Active Camera for Multi-Object Navigation

Getting robots to navigate to multiple objects autonomously is essential yet difficult in robot applications. One of the key challenges is how to explore environments efficiently with camera sensors only. Existing navigation methods mainly…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Peihao Chen, Dongyu Ji, Kunyang Lin, Weiwen Hu, Wenbing Huang, Thomas H. Li, Mingkui Tan, Chuang Gan

One Agent to Guide Them All: Empowering MLLMs for Vision-and-Language Navigation via Explicit World Representation

A navigable agent needs to understand both high-level semantic instructions and precise spatial perceptions. Building navigation agents centered on Multimodal Large Language Models (MLLMs) demonstrates a promising solution due to their…

Robotics · Computer Science 2026-02-18 Zerui Li, Hongpei Zheng, Fangguo Zhao, Aidan Chan, Jian Zhou, Sihao Lin, Shijie Li, Qi Wu

Navigating to Objects in Unseen Environments by Distance Prediction

The Object Goal Navigation (ObjectNav) task is to navigate an agent to an object category in unseen environments without a pre-built map. In this paper, we solve this task by predicting the distance to the target using semantically-related…

Robotics · Computer Science 2022-07-14 Minzhao Zhu, Binglei Zhao, Tao Kong

See What the Robot Can't See: Learning Cooperative Perception for Visual Navigation

We consider the problem of navigating a mobile robot towards a target in an unknown environment that is endowed with visual sensors, where neither the robot nor the sensors have access to global positioning information and only use…

Robotics · Computer Science 2023-08-01 Jan Blumenkamp, Qingbiao Li, Binyu Wang, Zhe Liu, Amanda Prorok

Optimizing Gaze Direction in a Visual Navigation Task

Navigation in an unknown environment consists of multiple separable subtasks, such as collecting information about the surroundings and navigating to the current goal. In the case of pure visual navigation, all these subtasks need to…

Robotics · Computer Science 2016-02-17 Tuomas Välimäki, Risto Ritala

Multi Agent Navigation in Unconstrained Environments using a Centralized Attention based Graphical Neural Network Controller

In this work, we propose a learning based neural model that provides both the longitudinal and lateral control commands to simultaneously navigate multiple vehicles. The goal is to ensure that each vehicle reaches a desired target state…

Robotics · Computer Science 2024-03-22 Yining Ma, Qadeer Khan, Daniel Cremers

Combining Optimal Control and Learning for Visual Navigation in Novel Environments

Model-based control is a popular paradigm for robot navigation because it can leverage a known dynamics model to efficiently plan robust robot trajectories. However, it is challenging to use model-based methods in settings where the…

Robotics · Computer Science 2019-07-19 Somil Bansal, Varun Tolani, Saurabh Gupta, Jitendra Malik, Claire Tomlin

Navigation beyond Wayfinding: Robots Collaborating with Visually Impaired Users for Environmental Interactions

Robotic guidance systems have shown promise in supporting blind and visually impaired (BVI) individuals with wayfinding and obstacle avoidance. However, most existing systems assume a clear path and do not support a critical aspect of…

Robotics · Computer Science 2026-03-17 Shaojun Cai, Nuwan Janaka, Ashwin Ram, Janidu Shehan, Yingjia Wan, Kotaro Hara, David Hsu

Understanding and Imitating Human-Robot Motion with Restricted Visual Fields

When working around other agents such as humans, it is important to model their perception capabilities to predict and make sense of their behavior. In this work, we consider agents whose perception capabilities are determined by their…

Robotics · Computer Science 2025-08-12 Maulik Bhatt, HongHao Zhen, Monroe Kennedy, Negar Mehr

Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation

Visual-audio navigation (VAN) is attracting more and more attention from the robotic community due to its broad applications, e.g., household robots and rescue robots. In this task, an embodied agent must search for and navigate to…

Robotics · Computer Science 2023-06-22 Hongcheng Wang, Yuxuan Wang, Fangwei Zhong, Mingdong Wu, Jianwei Zhang, Yizhou Wang, Hao Dong

Visual Navigation with Spatial Attention

This work focuses on object goal visual navigation, aiming at finding the location of an object from a given class, where in each step the agent is provided with an egocentric RGB image of the scene. We propose to learn the agent's policy…

Computer Vision and Pattern Recognition · Computer Science 2021-04-21 Bar Mayo, Tamir Hazan, Ayellet Tal

Advancing Autonomous Driving Perception: Analysis of Sensor Fusion and Computer Vision Techniques

In autonomous driving, perception systems are pivotal as they interpret sensory data to understand the environment, which is essential for decision-making and planning. Ensuring the safety of these perception systems is fundamental for…

Robotics · Computer Science 2024-11-19 Urvishkumar Bharti, Vikram Shahapur