Related papers: Bayesian Relational Memory for Semantic Visual Nav…

200 papers

Humans navigate unfamiliar environments using episodic simulation and episodic memory, which facilitate a deeper understanding of the complex relationships between environments and objects. Developing an imaginative memory system inspired…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Yiyuan Pan, Yunzhe Xu, Zhe Liu, Hesheng Wang

We introduce a new memory architecture for navigation in previously unseen environments, inspired by landmark-based navigation in animals. The proposed semi-parametric topological memory (SPTM) consists of a (non-parametric) graph with…

Machine Learning · Computer Science 2018-03-05 Nikolay Savinov, Alexey Dosovitskiy, Vladlen Koltun

Visual reprogramming (VR) leverages the intrinsic capabilities of pretrained vision models by adapting their input or output interfaces to solve downstream tasks whose labels (i.e., downstream labels) might be totally different from the…

Machine Learning · Computer Science 2024-11-01 Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-04-26 Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Visual navigation takes inspiration from humans, who navigate in previously unseen environments using vision without detailed environment maps. Inspired by this, we introduce a novel no-RL, no-graph, no-odometry approach to visual…

Computer Vision and Pattern Recognition · Computer Science 2024-12-16 Faith Johnson, Bryan Bo Cao, Ashwin Ashok, Shubham Jain, Kristin Dana

A novel framework is proposed to incrementally collect landmark-based graph memory and use the collected memory for image goal navigation. Given a target image to search, an embodied robot utilizes semantic memory to find the target in an…

Robotics · Computer Science 2022-09-20 Nuri Kim, Obin Kwon, Hwiyeon Yoo, Yunho Choi, Jeongho Park, Songhwai Oh

We introduce a learning-based approach for room navigation using semantic maps. Our proposed architecture learns to predict top-down belief maps of regions that lie beyond the agent's field of view while modeling architectural and stylistic…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Medhini Narasimhan, Erik Wijmans, Xinlei Chen, Trevor Darrell, Dhruv Batra, Devi Parikh, Amanpreet Singh

People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer,…

Computer Vision and Pattern Recognition · Computer Science 2021-08-18 Jing Yu Koh, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson

Scene recognition is a fundamental task in robotic perception. For human beings, scene recognition is reasonable because they have abundant object knowledge of the real world. The idea of transferring prior object knowledge from humans to…

Computer Vision and Pattern Recognition · Computer Science 2021-08-03 Liguang Zhou, Jun Cen, Xingchao Wang, Zhenglong Sun, Tin Lun Lam, Yangsheng Xu

Online learning via Bayes' theorem allows new data to be continuously integrated into an agent's current beliefs. However, a naive application of Bayesian methods in non-stationary environments leads to slow adaptation and results in state…

Machine Learning · Computer Science 2022-02-09 Josue Nassar, Jennifer Brennan, Ben Evans, Kendall Lowrey

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions. However, current VLN agents simply…

Computer Vision and Pattern Recognition · Computer Science 2021-03-08 Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Recent advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs) have made them powerful tools in embodied navigation, enabling agents to leverage commonsense and spatial reasoning for efficient exploration in…

Delivering intelligent and adaptive navigation assistance in augmented reality (AR) requires more than visual cues, as it demands systems capable of interpreting flexible user intent and reasoning over both spatial and semantic context.…

Human-Computer Interaction · Computer Science 2025-08-26 Hsuan-Kung Yang, Tsu-Ching Hsiao, Ryoichiro Oka, Ryuya Nishino, Satoko Tofukuji, Norimasa Kobori

Human cognition excels at transcending sensory input and forming latent representations that structure our understanding of the world. While Large Language Model (LLM) agents demonstrate emergent reasoning and decision-making abilities,…

Machine Learning · Computer Science 2026-01-22 Hengguan Huang, Xing Shen, Songtao Wang, Lingfa Meng, Dianbo Liu, David Alejandro Duchene, Hao Wang, Samir Bhatt

A critical component to enabling intelligent reasoning in partially observable environments is memory. Despite this importance, Deep Reinforcement Learning (DRL) agents have so far used relatively simple memory architectures, with the main…

Machine Learning · Computer Science 2017-02-28 Emilio Parisotto, Ruslan Salakhutdinov

Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit…

Artificial Intelligence · Computer Science 2023-02-01 Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra

External memory is a key component of modern large language model (LLM) systems, enabling long-term interaction and personalization. Despite its importance, memory management is still largely driven by hand-designed heuristics, offering…

Computation and Language · Computer Science 2025-12-29 Changzhi Sun, Xiangyu Chen, Jixiang Luo, Dell Zhang, Xuelong Li

This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments, which requires an autonomous agent to follow natural language instructions in unseen environments. Existing end-to-end…

In this work, we present a memory-augmented approach for image-goal navigation. Earlier attempts, including RL-based and SLAM-based approaches, have either shown poor generalization performance or are heavily reliant on pose/depth sensors.…

Computer Vision and Pattern Recognition · Computer Science 2023-01-06 Lina Mezghani, Sainbayar Sukhbaatar, Thibaut Lavril, Oleksandr Maksymets, Dhruv Batra, Piotr Bojanowski, Karteek Alahari

We present an approach for agents to learn representations of a global map from sensor data, to aid their exploration in new environments. To achieve this, we embed procedures mimicking that of traditional Simultaneous Localization and…

Machine Learning · Computer Science 2021-01-01 Jingwei Zhang, Lei Tai, Ming Liu, Joschka Boedecker, Wolfram Burgard