Related papers: VLPG-Nav: Object Navigation Using Visual Language …

VLG-Loc: Vision-Language Global Localization from Labeled Footprint Maps

This paper presents Vision-Language Global Localization (VLG-Loc), a novel global localization method that uses human-readable labeled footprint maps containing only names and areas of distinctive visual landmarks in an environment. While…

Robotics · Computer Science 2025-12-19 Mizuho Aoki, Kohei Honda, Yasuhiro Yoshimura, Takeshi Ishita, Ryo Yonetani

Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing

Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of…

Robotics · Computer Science 2024-09-18 Jun Zhu, Zihao Du, Haotian Xu, Fengbo Lan, Zilong Zheng, Bo Ma, Shengjie Wang, Tao Zhang

MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding

Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs),…

Robotics · Computer Science 2025-08-08 Weifan Zhang, Tingguang Li, Yuzhen Liu

Visual Language Maps for Robot Navigation

Grounding language to the visual observations of a navigating agent can be performed using off-the-shelf visual-language models pretrained on Internet-scale data (e.g., image captions). While this is useful for matching images to natural…

Robotics · Computer Science 2023-03-09 Chenguang Huang, Oier Mees, Andy Zeng, Wolfram Burgard

IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation

Vision-and-Language Navigation (VLN) is a challenging task that requires a robot to navigate in photo-realistic environments with human natural language promptings. Recent studies aim to handle this task by constructing the semantic spatial…

Computer Vision and Pattern Recognition · Computer Science 2024-03-29 Jiacui Huang, Hongtao Zhang, Mingbo Zhao, Zhou Wu

Coupling Vision and Proprioception for Navigation of Legged Robots

We exploit the complementary strengths of vision and proprioception to develop a point-goal navigation system for legged robots, called VP-Nav. Legged systems are capable of traversing more complex terrain than wheeled robots, but to fully…

Robotics · Computer Science 2022-07-26 Zipeng Fu, Ashish Kumar, Ananye Agarwal, Haozhi Qi, Jitendra Malik, Deepak Pathak

Probabilistic Visual Navigation with Bidirectional Image Prediction

Humans can robustly follow a visual trajectory defined by a sequence of images (i.e. a video) regardless of substantial changes in the environment or the presence of obstacles. We aim at endowing similar visual navigation capabilities to…

Robotics · Computer Science 2022-02-21 Noriaki Hirose, Shun Taguchi, Fei Xia, Roberto Martin-Martin, Kosuke Tahara, Masanori Ishigaki, Silvio Savarese

Position-guided Text Prompt for Vision-Language Pre-training

Vision-Language Pre-Training (VLP) has shown promising capabilities to align image and text pairs, facilitating a broad variety of cross-modal learning tasks. However, we observe that VLP models often lack the visual grounding/localization…

Computer Vision and Pattern Recognition · Computer Science 2023-06-08 Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan

Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation Using Vision Language Models

Visual target navigation is a critical capability for autonomous robots operating in unknown environments, particularly in human-robot interaction scenarios. While classical and learning-based methods have shown promise, most existing…

Robotics · Computer Science 2025-05-07 Bangguo Yu, Qihao Yuan, Kailai Li, Hamidreza Kasaei, Ming Cao

Improving Visual Place Recognition Based Robot Navigation By Verifying Localization Estimates

Visual Place Recognition (VPR) systems often have imperfect performance, affecting the 'integrity' of position estimates and subsequent robot navigation decisions. Previously, SVM classifiers have been used to monitor VPR integrity. This…

Computer Vision and Pattern Recognition · Computer Science 2024-11-20 Owen Claxton, Connor Malone, Helen Carson, Jason Ford, Gabe Bolton, Iman Shames, Michael Milford

VLM-Guided Visual Place Recognition for Planet-Scale Geo-Localization

Geo-localization from a single image at planet scale (essentially an advanced or extreme version of the kidnapped robot problem) is a fundamental and challenging task in applications such as navigation, autonomous driving and disaster…

Computer Vision and Pattern Recognition · Computer Science 2025-07-24 Sania Waheed, Na Min An, Michael Milford, Sarvapali D. Ramchurn, Shoaib Ehsan

Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing

This paper presents a framework for jointly grounding objects that follow certain semantic relationship constraints given in a scene graph. A typical natural scene contains several objects, often exhibiting visual relationships of varied…

Computer Vision and Pattern Recognition · Computer Science 2022-11-04 Aditay Tripathi, Anand Mishra, Anirban Chakraborty

Multi-Mobile Robot Localization and Navigation based on Visible Light Positioning

We demonstrated multi-mobile robot navigation based on Visible Light Positioning (VLP) localization. From our experiment, the VLP can accurately locate robots' positions in navigation.

Robotics · Computer Science 2022-11-02 Yanyi Chen, Zhiqing Zhong, Shangsheng Wen, Weipeng Guan

ViNG: Learning Open-World Navigation with Visual Goals

We propose a learning-based navigation system for reaching visually indicated goals and demonstrate this system on a real mobile robot platform. Learning provides an appealing alternative to conventional methods for robotic navigation:…

Robotics · Computer Science 2022-10-11 Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine

VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes

Robotic grasping faces new challenges in human-robot-interaction scenarios. We consider the task that the robot grasps a target object designated by human's language directives. The robot not only needs to locate a target based on…

Robotics · Computer Science 2023-08-02 Yuhao Lu, Yixuan Fan, Beixing Deng, Fangfu Liu, Yali Li, Shengjin Wang

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation

Object-oriented embodied navigation aims to locate specific objects, defined by category or depicted in images. Existing methods often struggle to generalize to open vocabulary goals without extensive training data. While recent advances in…

Robotics · Computer Science 2024-07-15 Meng Wei, Tai Wang, Yilun Chen, Hanqing Wang, Jiangmiao Pang, Xihui Liu

Visual Place Recognition: A Tutorial

Localization is an essential capability for mobile robots. A rapidly growing field of research in this area is Visual Place Recognition (VPR), which is the ability to recognize previously seen places in the world based solely on images.…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Stefan Schubert, Peer Neubert, Sourav Garg, Michael Milford, Tobias Fischer

Vision-Based Localization and LLM-based Navigation for Indoor Environments

Indoor navigation remains a complex challenge due to the absence of reliable GPS signals and the architectural intricacies of large enclosed environments. This study presents an indoor localization and navigation approach that integrates…

Machine Learning · Computer Science 2025-08-12 Keyan Rahimi, Md. Wasiul Haque, Sagar Dasgupta, Mizanur Rahman

Context-Based Visual-Language Place Recognition

In vision-based robot localization and SLAM, Visual Place Recognition (VPR) is essential. This paper addresses the problem of VPR, which involves accurately recognizing the location corresponding to a given query image. A popular approach…

Robotics · Computer Science 2024-10-28 Soojin Woo, Seong-Woo Kim

Vision-based Robotic Grasping From Object Localization, Object Pose Estimation to Grasp Estimation for Parallel Grippers: A Review

This paper presents a comprehensive survey on vision-based robotic grasping. We conclude three key tasks during vision-based robotic grasping, which are object localization, object pose estimation and grasp estimation. In detail, the object…

Robotics · Computer Science 2020-12-24 Guoguang Du, Kai Wang, Shiguo Lian, Kaiyong Zhao