Related papers: Learning Language Structures through Grounding

200 papers

Language grounding aims at linking the symbolic representation of language (e.g., words) into the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space -the…

Computation and Language · Computer Science 2021-09-15 Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many…

Computation and Language · Computer Science 2023-11-01 Hassan Shahmohammadi, Maria Heitmeier, Elnaz Shafaei-Bajestan, Hendrik P. A. Lensch, Harald Baayen

We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning…

Computation and Language · Computer Science 2021-08-02 Nisha Pillai, Cynthia Matuszek, Francis Ferraro

We propose a weakly-supervised approach that takes image-sentence pairs as input and learns to visually ground (i.e., localize) arbitrary linguistic phrases, in the form of spatial attention masks. Specifically, the model is trained with…

Computer Vision and Pattern Recognition · Computer Science 2017-05-04 Fanyi Xiao, Leonid Sigal, Yong Jae Lee

Robots are widely collaborating with human users in different tasks that require high-level cognitive functions to make them able to discover the surrounding environment. A difficult challenge that we briefly highlight in this short paper is…

Computation and Language · Computer Science 2020-03-16 Amir Aly, Tadahiro Taniguchi

This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of…

Artificial Intelligence · Computer Science 2022-02-22 Grzegorz Chrupała

Language grounding is an active field aiming at enriching textual representations with visual information. Generally, textual and visual elements are embedded in the same representation space, which implicitly assumes a one-to-one…

Computation and Language · Computer Science 2020-02-10 Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari

Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension, and their internal representations are remarkably well-aligned with representations of language in the human brain. But to…

Computation and Language · Computer Science 2024-03-27 Chengxu Zhuang, Evelina Fedorenko, Jacob Andreas

We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex…

Artificial Intelligence · Computer Science 2011-07-04 P. Gorniak, D. Roy

Key to tasks that require reasoning about natural language in visual contexts is grounding words and phrases to image regions. However, observing this grounding in contemporary models is complex, even if it is generally expected to take…

Computation and Language · Computer Science 2024-06-03 Noriyuki Kojima, Hadar Averbuch-Elor, Yoav Artzi

Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on…

Computation and Language · Computer Science 2019-09-25 Danny Merkx, Stefan L. Frank, Mirjam Ernestus

When we speak, write or listen, we continuously make predictions based on our knowledge of a language's grammar. Remarkably, children acquire this grammatical knowledge within just a few years, enabling them to understand and generalise to…

Computation and Language · Computer Science 2024-11-26 Jaap Jumelet

Visual grounding of language aims at enriching textual representations of language with multiple sources of visual knowledge such as images and videos. Although visual grounding is an area of intense research, inter-lingual aspects of…

Computation and Language · Computer Science 2022-11-22 Wafaa Mohammed, Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

We are increasingly surrounded by artificially intelligent technology that takes decisions and executes actions on our behalf. This creates a pressing need for general means to communicate with, instruct and guide artificial agents, with…

Visual Grounding, also known as Referring Expression Comprehension and Phrase Grounding, aims to ground the specific region(s) within the image(s) based on the given expression text. This task simulates the common referential relationships…

Computer Vision and Pattern Recognition · Computer Science 2025-11-12 Linhui Xiao, Xiaoshan Yang, Xiangyuan Lan, Yaowei Wang, Changsheng Xu

People rely heavily on context to enrich meaning beyond what is literally said, enabling concise but effective communication. To interact successfully and naturally with people, user-facing artificial intelligence systems will require…

Computation and Language · Computer Science 2023-11-23 Daniel Fried, Nicholas Tomlin, Jennifer Hu, Roma Patel, Aida Nematzadeh

Cognitive grammar suggests that the acquisition of language grammar is grounded within visual structures. While grammar is an essential representation of natural language, it also exists ubiquitously in vision to represent the hierarchical…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Yining Hong, Qing Li, Song-Chun Zhu, Siyuan Huang

In natural language processing, most models try to learn semantic representations merely from texts. The learned representations encode the distributional semantics but fail to connect to any knowledge about the physical world. In contrast,…

Computation and Language · Computer Science 2021-11-16 Yizhen Zhang, Minkyu Choi, Kuan Han, Zhongming Liu

Semantic parsing aims to map natural language utterances onto machine interpretable meaning representations, aka programs whose execution against a real-world environment produces a denotation. Weakly-supervised semantic parsers are trained…

Computation and Language · Computer Science 2019-09-11 Bailin Wang, Ivan Titov, Mirella Lapata

A robot's ability to understand or ground natural language instructions is fundamentally tied to its knowledge about the surrounding world. We present an approach to grounding natural language utterances in the context of factual…

Robotics · Computer Science 2018-11-19 Rohan Paul, Andrei Barbu, Sue Felshin, Boris Katz, Nicholas Roy