Related papers: Embodied Visual Recognition

200 papers

We study the task of embodied visual active learning, where an agent is set to explore a 3D environment with the goal of acquiring visual scene understanding by actively selecting views for which to request annotation. While accurate on some…

Computer Vision and Pattern Recognition · Computer Science 2020-12-18 David Nilsson , Aleksis Pirinen , Erik Gärtner , Cristian Sminchisescu

We study lifelong visual perception in an embodied setup, where we develop new models and compare various agents that navigate in buildings and occasionally request annotations which, in turn, are used to refine their visual perception…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 David Nilsson , Aleksis Pirinen , Erik Gärtner , Cristian Sminchisescu

The "embodied visual navigation" problem requires an agent to navigate in a 3D environment relying mainly on its first-person observation. This problem has attracted rising attention in recent years due to its wide application in autonomous…

Robotics · Computer Science 2021-10-12 Fengda Zhu , Yi Zhu , Vincent CS Lee , Xiaodan Liang , Xiaojun Chang

Embodied visual tracking is to follow a target object in dynamic 3D environments using an agent's egocentric vision. This is a vital and challenging skill for embodied agents. However, existing methods suffer from inefficient training and…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Fangwei Zhong , Kui Wu , Hai Ci , Churan Wang , Hao Chen

Passive methods for object detection and segmentation treat images of the same scene as individual samples and do not exploit object permanence across multiple views. Generalization to novel or difficult viewpoints thus requires additional…

Computer Vision and Pattern Recognition · Computer Science 2021-03-30 Zhaoyuan Fang , Ayush Jain , Gabriel Sarch , Adam W. Harley , Katerina Fragkiadaki

Recent time-contrastive learning approaches manage to learn invariant object representations without supervision. This is achieved by mapping successive views of an object onto close-by internal representations. When considering this…

Machine Learning · Computer Science 2022-05-13 Arthur Aubret , Céline Teulière , Jochen Triesch

In Vision-and-Language Navigation (VLN), an embodied agent needs to reach a target destination with the only guidance of a natural language instruction. To explore the environment and progress towards the target location, the agent must…

Computer Vision and Pattern Recognition · Computer Science 2019-09-26 Federico Landi , Lorenzo Baraldi , Massimiliano Corsini , Rita Cucchiara

We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents. In contrast to prior work on self-supervised learning for images, we aim to jointly encode a series of…

Computer Vision and Pattern Recognition · Computer Science 2021-02-05 Santhosh K. Ramakrishnan , Tushar Nagarajan , Ziad Al-Halah , Kristen Grauman

Embodiment is an important characteristic for all intelligent agents (creatures and robots), while existing scene description tasks mainly focus on analyzing images passively and the semantic understanding of the scenario is separated from…

Robotics · Computer Science 2020-05-08 Sinan Tan , Huaping Liu , Di Guo , Xinyu Zhang , Fuchun Sun

When a robot encounters a novel object, how should it respond (what data should it collect) so that it can find the object in the future? In this work, we present a method for learning image features of an…

Robotics · Computer Science 2024-10-16 Allison Pinosky , Todd D. Murphey

Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion of the target object by the agent's arm,…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Kiana Ehsani , Ali Farhadi , Aniruddha Kembhavi , Roozbeh Mottaghi

Embodied computer vision considers perception for robots in novel, unstructured environments. Of particular importance is the embodied visual exploration problem: how might a robot equipped with a camera scope out a new environment? Despite…

Computer Vision and Pattern Recognition · Computer Science 2020-08-24 Santhosh K. Ramakrishnan , Dinesh Jayaraman , Kristen Grauman

The development of embodied agents that can communicate with humans in natural language has gained increasing interest over the last years, as it facilitates the diffusion of robotic platforms in human-populated environments. As a step…

Robotics · Computer Science 2024-04-16 Roberto Bigazzi , Marcella Cornia , Silvia Cascianelli , Lorenzo Baraldi , Rita Cucchiara

A crucial ability of mobile intelligent agents is to integrate the evidence from multiple sensory inputs in an environment and to make a sequence of actions to reach their goals. In this paper, we attempt to approach the problem of…

Computer Vision and Pattern Recognition · Computer Science 2020-03-10 Chuang Gan , Yiwei Zhang , Jiajun Wu , Boqing Gong , Joshua B. Tenenbaum

Embodied scene understanding serves as the cornerstone for autonomous agents to perceive, interpret, and respond to open driving scenarios. Such understanding is typically founded upon Vision-Language Models (VLMs). Nevertheless, existing…

Computer Vision and Pattern Recognition · Computer Science 2024-03-08 Yunsong Zhou , Linyan Huang , Qingwen Bu , Jia Zeng , Tianyu Li , Hang Qiu , Hongzi Zhu , Minyi Guo , Yu Qiao , Hongyang Li

Embodied scene understanding requires not only comprehending visual-spatial information that has been observed but also determining where to explore next in the 3D physical world. Existing 3D Vision-Language (3D-VL) models primarily focus…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Ziyu Zhu , Xilin Wang , Yixuan Li , Zhuofan Zhang , Xiaojian Ma , Yixin Chen , Baoxiong Jia , Wei Liang , Qian Yu , Zhidong Deng , Siyuan Huang , Qing Li

This paper describes our research on AI agents embodied in visual, virtual or physical forms, enabling them to interact with both users and their environments. These agents, which include virtual avatars, wearable devices, and robots, are…

Recent years have seen embodied visual navigation advance in two distinct directions: (i) in equipping the AI agent to follow natural language instructions, and (ii) in making the navigable world multimodal, e.g., audio-visual navigation.…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Sudipta Paul , Amit K. Roy-Chowdhury , Anoop Cherian

How should we learn visual representations for embodied agents that must see and move? The status quo is tabula rasa in vivo, i.e. learning visual representations from scratch while also learning to move, potentially augmented with…

Computer Vision and Pattern Recognition · Computer Science 2022-04-29 Karmesh Yadav , Ram Ramrakhya , Arjun Majumdar , Vincent-Pierre Berges , Sachit Kuhar , Dhruv Batra , Alexei Baevski , Oleksandr Maksymets

This paper investigates the problem of understanding dynamic 3D scenes from egocentric observations, a key challenge in robotics and embodied AI. Unlike prior studies that explored this as long-form video understanding and utilized…

Computer Vision and Pattern Recognition · Computer Science 2025-01-10 Yue Fan , Xiaojian Ma , Rongpeng Su , Jun Guo , Rujie Wu , Xi Chen , Qing Li