Related papers: MemoNav: Working Memory Model for Visual Navigatio…

MemoNav: Selecting Informative Memories for Visual Navigation

Image-goal navigation is a challenging task, as it requires the agent to navigate to a target indicated by an image in a previously unseen scene. Current methods introduce diverse memory mechanisms which save navigation history to solve…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Hongxin Li, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

TopoNav: Topological Graphs as a Key Enabler for Advanced Object Navigation

Object Navigation (ObjectNav) has made great progress with large language models (LLMs), but still faces challenges in memory management, especially in long-horizon tasks and dynamic scenes. To address this, we propose TopoNav, a new…

Robotics · Computer Science 2025-09-03 Peiran Liu, Qiang Zhang, Daojie Peng, Lingfeng Zhang, Yihao Qin, Hang Zhou, Jun Ma, Renjing Xu, Yiding Ji

MCNav: Memory-Aware Dynamic Cognitive Map for Zero-shot Goal-oriented Navigation

Navigating to instance-level targets in complex environments is a challenging problem. Many existing zero-shot methods achieve strong performance by modeling the entire environment and leveraging large language models for scene…

Robotics · Computer Science 2026-05-20 Jingyu Li, Zhe Liu, Wenxiao Wu, Li Zhang

WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation

Object Goal Navigation, requiring an agent to locate a specific object in an unseen environment, remains a core challenge in embodied AI. Although recent progress in Vision-Language Model (VLM)-based agents has demonstrated promising…

Computer Vision and Pattern Recognition · Computer Science 2025-07-22 Dujun Nie, Xianda Guo, Yiqun Duan, Ruijun Zhang, Long Chen

SSMG-Nav: Enhancing Lifelong Object Navigation with Semantic Skeleton Memory Graph

Navigating to out-of-sight targets from human instructions in unfamiliar environments is a core capability for service robots. Despite substantial progress, most approaches underutilize reusable, persistent memory, constraining performance…

Robotics · Computer Science 2026-03-03 Haochen Niu, Lantao Zhang, Xingwu Ji, Rendong Ying, Peilin Liu, Fei Wen

FOM-Nav: Frontier-Object Maps for Object Goal Navigation

This paper addresses the Object Goal Navigation problem, where a robot must efficiently find a target object in an unknown environment. Existing implicit memory-based methods struggle with long-term memory retention and planning, while…

Robotics · Computer Science 2025-12-02 Thomas Chabal, Shizhe Chen, Jean Ponce, Cordelia Schmid

Object Memory Transformer for Object Goal Navigation

This paper presents a reinforcement learning method for object goal navigation (ObjNav) where an agent navigates in 3D indoor environments to reach a target object based on long-term observations of objects and scenes. To this end, we…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Rui Fukushima, Kei Ota, Asako Kanezaki, Yoko Sasaki, Yusuke Yoshiyasu

OVAL: Open-Vocabulary Augmented Memory Model for Lifelong Object Goal Navigation

Object Goal Navigation (ObjectNav) refers to an agent navigating to an object in an unseen environment, which is an ability often required in the accomplishment of complex tasks. While existing methods demonstrate proficiency in isolated…

Robotics · Computer Science 2026-04-15 Jiahua Pei, Yi Liu, Guoping Pan, Yuanhao Jiang, Houde Liu, Xueqian Wang

Memory-Augmented Reinforcement Learning for Image-Goal Navigation

In this work, we present a memory-augmented approach for image-goal navigation. Earlier attempts, including RL-based and SLAM-based approaches, have either shown poor generalization performance or are heavily reliant on pose/depth sensors.…

Computer Vision and Pattern Recognition · Computer Science 2023-01-06 Lina Mezghani, Sainbayar Sukhbaatar, Thibaut Lavril, Oleksandr Maksymets, Dhruv Batra, Piotr Bojanowski, Karteek Alahari

Image-Goal Navigation in Complex Environments via Modular Learning

We present a novel approach for image-goal navigation, where an agent navigates with a goal image rather than accurate target information, which is more challenging. Our goal is to decouple the learning of navigation goal planning,…

Robotics · Computer Science 2022-02-23 Qiaoyun Wu, Jun Wang, Jing Liang, Xiaoxi Gong, Dinesh Manocha

AstraNav-Memory: Contexts Compression for Long Memory

Lifelong embodied navigation requires agents to accumulate, retain, and exploit spatial-semantic experience across tasks, enabling efficient exploration in novel environments and rapid goal reaching in familiar ones. While object-centric…

Robotics · Computer Science 2025-12-29 Botao Ren, Junjun Hu, Xinda Xue, Minghua Luo, Jintao Chen, Haochen Bai, Liangliang You, Mu Xu

MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory

We present MG-Nav (Memory-Guided Navigation), a dual-scale framework for zero-shot visual navigation that unifies global memory-guided planning with local geometry-enhanced control. At its core is the Sparse Spatial Memory Graph (SMG), a…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Bo Wang, Jiehong Lin, Chenzhi Liu, Xinting Hu, Yifei Yu, Tianjia Liu, Zhongrui Wang, Xiaojuan Qi

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

Object goal navigation (ObjectNav) is a fundamental task in embodied AI, requiring an agent to locate a target object in previously unseen environments. This task is particularly challenging because it requires both perceptual and cognitive…

Computer Vision and Pattern Recognition · Computer Science 2025-08-29 Yihan Cao, Jiazhao Zhang, Zhinan Yu, Shuzhen Liu, Zheng Qin, Qin Zou, Bo Du, Kai Xu

Think before Go: Hierarchical Reasoning for Image-goal Navigation

Image-goal navigation steers an agent to a target location specified by an image in unseen environments. Existing methods primarily handle this task by learning an end-to-end navigation policy, which compares the similarities of target and…

Robotics · Computer Science 2026-04-21 Pengna Li, Kangyi Wu, Shaoqing Xu, Fang Li, Lin Zhao, Long Chen, Zhi-Xin Yang, Nanning Zheng

Stop Wandering: Efficient Vision-Language Navigation via Metacognitive Reasoning

Training-free Vision-Language Navigation (VLN) agents powered by foundation models can follow instructions and explore 3D environments. However, existing approaches rely on greedy frontier selection and passive spatial memory, leading to…

Robotics · Computer Science 2026-04-03 Xueying Li, Feng Lyu, Hao Wu, Mingliu Liu, Jia-Nan Liu, Guozi Liu

Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System

Vision-and-Language Navigation (VLN) in large-scale urban environments requires embodied agents to ground linguistic instructions in complex scenes and recall relevant experiences over extended time horizons. Prior modular pipelines offer…

Computer Vision and Pattern Recognition · Computer Science 2025-10-13 Lixuan He, Haoyu Dong, Zhenxing Chen, Yangcheng Yu, Jie Feng, Yong Li

MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation

Visual navigation for autonomous agents is a core task in the fields of computer vision and robotics. Learning-based methods, such as deep reinforcement learning, have the potential to outperform the classical solutions developed for this…

Computer Vision and Pattern Recognition · Computer Science 2021-03-23 Zachary Seymour, Kowshik Thopalli, Niluthpol Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

Navigation tasks in photorealistic 3D environments are challenging because they require perception and effective planning under partial observability. Recent work shows that map-like memory is useful for long-horizon navigation tasks.…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva

ImagineNav++: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination

Visual navigation is a fundamental capability for autonomous home-assistance robots, enabling long-horizon tasks such as object search. While recent methods have leveraged Large Language Models (LLMs) to incorporate commonsense reasoning…

Robotics · Computer Science 2026-05-01 Teng Wang, Xinxin Zhao, Wenzhe Cai, Changyin Sun

Graph Attention Memory for Visual Navigation

Visual navigation in complex environments is inefficient with a traditional reactive policy or a general-purpose recurrent policy. To address the long-term memory issue, this paper proposes a graph attention memory (GAM) architecture…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng Zhuang, Bin Wang, Wulong Liu, Rasul Tutunov, Jun Wang