Related papers: Bayesian Relational Memory for Semantic Visual Nav…

Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation

Humans navigate unfamiliar environments using episodic simulation and episodic memory, which facilitate a deeper understanding of the complex relationships between environments and objects. Developing an imaginative memory system inspired…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-30 · Yiyuan Pan, Yunzhe Xu, Zhe Liu, Hesheng Wang

Semi-parametric Topological Memory for Navigation

We introduce a new memory architecture for navigation in previously unseen environments, inspired by landmark-based navigation in animals. The proposed semi-parametric topological memory (SPTM) consists of a (non-parametric) graph with…

Machine Learning · Computer Science · 2018-03-05 · Nikolay Savinov, Alexey Dosovitskiy, Vladlen Koltun

Bayesian-guided Label Mapping for Visual Reprogramming

Visual reprogramming (VR) leverages the intrinsic capabilities of pretrained vision models by adapting their input or output interfaces to solve downstream tasks whose labels (i.e., downstream labels) might be totally different from the…

Machine Learning · Computer Science · 2024-11-01 · Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-26 · Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Memory Proxy Maps for Visual Navigation

Visual navigation takes inspiration from humans, who navigate in previously unseen environments using vision without detailed environment maps. Inspired by this, we introduce a novel no-RL, no-graph, no-odometry approach to visual…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-16 · Faith Johnson, Bryan Bo Cao, Ashwin Ashok, Shubham Jain, Kristin Dana

Topological Semantic Graph Memory for Image-Goal Navigation

A novel framework is proposed to incrementally collect landmark-based graph memory and use the collected memory for image goal navigation. Given a target image to search, an embodied robot utilizes semantic memory to find the target in an…

Robotics · Computer Science · 2022-09-20 · Nuri Kim, Obin Kwon, Hwiyeon Yoo, Yunho Choi, Jeongho Park, Songhwai Oh

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation

We introduce a learning-based approach for room navigation using semantic maps. Our proposed architecture learns to predict top-down belief maps of regions that lie beyond the agent's field of view while modeling architectural and stylistic…

Computer Vision and Pattern Recognition · Computer Science · 2020-07-21 · Medhini Narasimhan, Erik Wijmans, Xinlei Chen, Trevor Darrell, Dhruv Batra, Devi Parikh, Amanpreet Singh

Pathdreamer: A World Model for Indoor Navigation

People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer,…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-18 · Jing Yu Koh, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson

BORM: Bayesian Object Relation Model for Indoor Scene Recognition

Scene recognition is a fundamental task in robotic perception. For human beings, scene recognition is reasonable because they have abundant object knowledge of the real world. The idea of transferring prior object knowledge from humans to…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-03 · Liguang Zhou, Jun Cen, Xingchao Wang, Zhenglong Sun, Tin Lun Lam, Yangsheng Xu

BAM: Bayes with Adaptive Memory

Online learning via Bayes' theorem allows new data to be continuously integrated into an agent's current beliefs. However, a naive application of Bayesian methods in non-stationary environments leads to slow adaptation and results in state…

Machine Learning · Computer Science · 2022-02-09 · Josue Nassar, Jennifer Brennan, Ben Evans, Kendall Lowrey

Structured Scene Memory for Vision-Language Navigation

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., entailing an agent to navigate 3D environments through following linguistic instructions. However, current VLN agents simply…

Computer Vision and Pattern Recognition · Computer Science · 2021-03-08 · Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation

Recent advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs) have made them powerful tools in embodied navigation, enabling agents to leverage commonsense and spatial reasoning for efficient exploration in…

Robotics · Computer Science · 2025-06-12 · Lingfeng Zhang, Yuecheng Liu, Zhanguang Zhang, Matin Aghaei, Yaochen Hu, Hongjian Gu, Mohammad Ali Alomrani, David Gamaliel Arcos Bravo, Raika Karimi, Atia Hamidizadeh, Haoping Xu, Guowei Huang, Zhanpeng Zhang, Tongtong Cao, Weichao Qiu, Xingyue Quan, Jianye Hao, Yuzheng Zhuang, Yingxue Zhang

An Embodied AR Navigation Agent: Integrating BIM with Retrieval-Augmented Generation for Language Guidance

Delivering intelligent and adaptive navigation assistance in augmented reality (AR) requires more than visual cues, as it demands systems capable of interpreting flexible user intent and reasoning over both spatial and semantic context.…

Human-Computer Interaction · Computer Science · 2025-08-26 · Hsuan-Kung Yang, Tsu-Ching Hsiao, Ryoichiro Oka, Ryuya Nishino, Satoko Tofukuji, Norimasa Kobori

BayesAgent: Bayesian Agentic Reasoning Under Uncertainty via Verbalized Probabilistic Graphical Modeling

Human cognition excels at transcending sensory input and forming latent representations that structure our understanding of the world. While Large Language Model (LLM) agents demonstrate emergent reasoning and decision-making abilities,…

Machine Learning · Computer Science · 2026-01-22 · Hengguan Huang, Xing Shen, Songtao Wang, Lingfa Meng, Dianbo Liu, David Alejandro Duchene, Hao Wang, Samir Bhatt

Neural Map: Structured Memory for Deep Reinforcement Learning

A critical component to enabling intelligent reasoning in partially observable environments is memory. Despite this importance, Deep Reinforcement Learning (DRL) agents have so far used relatively simple memory architectures, with the main…

Machine Learning · Computer Science · 2017-02-28 · Emilio Parisotto, Ruslan Salakhutdinov

Emergence of Maps in the Memories of Blind Navigation Agents

Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit…

Artificial Intelligence · Computer Science · 2023-02-01 · Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra

Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management

External memory is a key component of modern large language model (LLM) systems, enabling long-term interaction and personalization. Despite its importance, memory management is still largely driven by hand-designed heuristics, offering…

Computation and Language · Computer Science · 2025-12-29 · Changzhi Sun, Xiangyu Chen, Jixiang Luo, Dell Zhang, Xuelong Li

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments

This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments, which requires an autonomous agent to follow natural language instructions in unseen environments. Existing end-to-end…

Robotics · Computer Science · 2021-08-27 · Muhammad Zubair Irshad, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

Memory-Augmented Reinforcement Learning for Image-Goal Navigation

In this work, we present a memory-augmented approach for image-goal navigation. Earlier attempts, including RL-based and SLAM-based approaches, have either shown poor generalization performance or are heavily reliant on pose/depth sensors.…

Computer Vision and Pattern Recognition · Computer Science · 2023-01-06 · Lina Mezghani, Sainbayar Sukhbaatar, Thibaut Lavril, Oleksandr Maksymets, Dhruv Batra, Piotr Bojanowski, Karteek Alahari

Neural SLAM: Learning to Explore with External Memory

We present an approach for agents to learn representations of a global map from sensor data, to aid their exploration in new environments. To achieve this, we embed procedures mimicking that of traditional Simultaneous Localization and…

Machine Learning · Computer Science · 2021-01-01 · Jingwei Zhang, Lei Tai, Ming Liu, Joschka Boedecker, Wolfram Burgard