Related papers: Grounding Language Attributes to Objects using Bay…

200 papers

As robots become more ubiquitous and capable, it becomes ever more important to enable untrained users to easily interact with them. Recently, this has led to study of the language grounding problem, where the goal is to extract…

Computation and Language · Computer Science 2012-07-03 Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, Dieter Fox

In this work we explore how fine-grained differences between the shapes of common objects are expressed in language, grounded on images and 3D models of the objects. We first build a large scale, carefully controlled dataset of human…

Computation and Language · Computer Science 2019-05-09 Panos Achlioptas, Judy Fan, Robert X. D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

Human language is one of the most natural interfaces for humans to interact with robots. This paper presents a robot system that retrieves everyday objects with unconstrained natural language descriptions. A core issue for the system is…

Robotics · Computer Science 2017-07-19 Mohit Shridhar, David Hsu

We study the problem of learning a robot policy to follow natural language instructions that can be easily extended to reason about new objects. We introduce a few-shot language-conditioned object grounding method trained from augmented…

Robotics · Computer Science 2020-11-17 Valts Blukis, Ross A. Knepper, Yoav Artzi

For robots to understand human instructions and perform meaningful tasks in the near future, it is important to develop learned models that comprehend referential language to identify common objects in real-world 3D scenes. In this paper,…

Robotics · Computer Science 2021-11-08 Junha Roh, Karthik Desingh, Ali Farhadi, Dieter Fox

Narrated instructional videos often show and describe manipulations of similar objects, e.g., repairing a particular model of a car or laptop. In this work we aim to reconstruct such objects and to localize associated narrations in 3D.…

Computer Vision and Pattern Recognition · Computer Science 2021-09-13 Dimitri Zhukov, Ignacio Rocco, Ivan Laptev, Josef Sivic, Johannes L. Schönberger, Bugra Tekin, Marc Pollefeys

Seemingly simple natural language requests to a robot are generally underspecified, for example "Can you bring me the wireless mouse?" Flat images of candidate mice may not provide the discriminative information needed for "wireless." The…

Computation and Language · Computer Science 2021-09-16 Jesse Thomason, Mohit Shridhar, Yonatan Bisk, Chris Paxton, Luke Zettlemoyer

We present a new method, PARsing And visual GrOuNding (ParaGon), for grounding natural language in object placement tasks. Natural language generally describes objects and spatial relations with compositionality and ambiguity, two major…

Robotics · Computer Science 2023-03-14 Zirui Zhao, Wee Sun Lee, David Hsu

Grounded understanding of natural language in physical scenes can greatly benefit robots that follow human instructions. In object manipulation scenarios, existing end-to-end models are proficient at understanding semantic concepts, but…

Robotics · Computer Science 2023-04-03 Qian Luo, Yunfei Li, Yi Wu

Recent advances in open-vocabulary object detection models will enable Automatic Target Recognition systems to be sustainable and repurposed by non-technical end-users for a variety of applications or missions. New, and potentially nuanced,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-24 Louis Y. Kim, Michelle Karker, Victoria Valledor, Seiyoung C. Lee, Karl F. Brzoska, Margaret Duff, Anthony Palladino

This paper presents INGRESS, a robot system that follows human natural language instructions to pick and place everyday objects. The core issue here is the grounding of referring expressions: infer objects and their relationships from input…

Robotics · Computer Science 2018-06-12 Mohit Shridhar, David Hsu

A phrase grounding system localizes a particular object in an image referred to by a natural language query. In previous work, the phrases were restricted to nouns that were encountered in training; we extend the task to Zero-Shot…

Computer Vision and Pattern Recognition · Computer Science 2019-08-21 Arka Sadhu, Kan Chen, Ram Nevatia

Deep-learning and large scale language-image training have produced image object detectors that generalise well to diverse environments and semantic classes. However, single-image object detectors trained on internet data are not optimally…

Robotics · Computer Science 2024-02-07 Nicolas Harvey Chapman, Feras Dayoub, Will Browne, Chris Lehnert

Localizing objects in 3D scenes based on natural language requires understanding and reasoning about spatial relations. In particular, it is often crucial to distinguish similar objects referred to by the text, such as "the left most chair"…

Computer Vision and Pattern Recognition · Computer Science 2022-11-18 Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev

Object Transfiguration replaces an object in an image with another object from a second image. For example, it can perform tasks like "putting exactly those eyeglasses from image A on the nose of the person in image B". Usage of exemplar…

Computer Vision and Pattern Recognition · Computer Science 2017-05-16 Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, Weiran He

Most state-of-the-art semi-supervised video object segmentation methods rely on a pixel-accurate mask of a target object provided for the first frame of a video. However, obtaining a detailed segmentation mask is expensive and…

Computer Vision and Pattern Recognition · Computer Science 2019-02-06 Anna Khoreva, Anna Rohrbach, Bernt Schiele

We address the problem of jointly learning vision and language to understand the object in a fine-grained manner. The key idea of our approach is the use of object descriptions to provide the detailed understanding of an object. Based on…

Computer Vision and Pattern Recognition · Computer Science 2018-03-19 Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis

Artificial object perception usually relies on a priori defined models and feature extraction algorithms. We study how the concept of object can be grounded in the sensorimotor experience of a naive agent. Without any knowledge about itself…

Robotics · Computer Science 2016-09-27 Alban Laflaquière, Nikolas Hemion

Grounding objects in images using visual cues is a well-established approach in computer vision, yet the potential of audio as a modality for object recognition and grounding remains underexplored. We introduce YOSS, "You Only Speak Once to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li

We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning…

Computation and Language · Computer Science 2021-08-02 Nisha Pillai, Cynthia Matuszek, Francis Ferraro