Related papers: Efficient Multi-Modal Embeddings from Structured D…

200 papers

Distributional semantic models capture word-level meaning that is useful in many natural language processing tasks and have even been shown to capture cognitive aspects of word meaning. The majority of these models are purely text based,…

Computation and Language · Computer Science · 2022-03-31 · Danny Merkx, Stefan L. Frank, Mirjam Ernestus

Representing the semantics of words is a long-standing problem for the natural language processing community. Most methods compute word semantics given their textual context in large corpora. More recently, researchers attempted to…

Computation and Language · Computer Science · 2017-11-10 · Éloi Zablocki, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari

Multimodal embeddings aim to enrich the semantic information in neural representations of language compared to text-only models. While different embeddings exhibit different applicability and performance on downstream tasks, little is known…

Computation and Language · Computer Science · 2023-06-06 · Aleksey Tikhonov, Lisa Bylinina, Denis Paperno

We propose a novel approach to improve a visual-semantic embedding model by incorporating concept representations captured from an external structured knowledge base. We investigate its performance on image classification under both…

Computer Vision and Pattern Recognition · Computer Science · 2020-09-22 · Mirantha Jayathilaka, Tingting Mu, Uli Sattler

Current approaches to learning semantic representations of sentences often use prior word-level knowledge. The current study aims to leverage visual information in order to capture sentence level semantics without the need for word…

Computation and Language · Computer Science · 2019-09-25 · Danny Merkx, Stefan Frank

Numerous embedding models have been recently explored to incorporate semantic knowledge into visual recognition. Existing methods typically focus on minimizing the distance between the corresponding images and texts in the embedding space…

Computer Vision and Pattern Recognition · Computer Science · 2017-06-06 · Dong Li, Hsin-Ying Lee, Jia-Bin Huang, Shengjin Wang, Ming-Hsuan Yang

Language grounding aims at linking the symbolic representation of language (e.g., words) into the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space - the…

Computation and Language · Computer Science · 2021-09-15 · Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

Semantic embeddings have advanced the state of the art for countless natural language processing tasks, and various extensions to multimodal domains, such as visual-semantic embeddings, have been proposed. While the power of visual-semantic…

Machine Learning · Computer Science · 2021-02-23 · Adam Dahlgren Lindström, Suna Bensch, Johanna Björklund, Frank Drewes

Despite significant progress in multimodal language models (LMs), it remains unclear whether visual grounding enhances their understanding of embodied knowledge compared to text-only models. To address this question, we propose a novel…

Computation and Language · Computer Science · 2025-10-21 · Zhihui Yang, Yupei Wang, Kaijie Mo, Zhe Zhao, Renfen Hu

Visual grounding of language aims at enriching textual representations of language with multiple sources of visual knowledge such as images and videos. Although visual grounding is an area of intense research, inter-lingual aspects of…

Computation and Language · Computer Science · 2022-11-22 · Wafaa Mohammed, Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

Language grounding is an active field aiming at enriching textual representations with visual information. Generally, textual and visual elements are embedded in the same representation space, which implicitly assumes a one-to-one…

Computation and Language · Computer Science · 2020-02-10 · Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari

Multi-modal visual understanding of images with prompts involves using various visual and textual cues to enhance the semantic understanding of images. This approach combines both vision and language processing to generate more accurate…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-17 · Yuzhou Peng

Biological research has revealed that the verbal semantic information in the brain cortex, as an additional source, participates in nonverbal semantic tasks, such as visual encoding. However, previous visual encoding models did not…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-30 · Shuxiao Ma, Linyuan Wang, Bin Yan

Human-annotated attributes serve as powerful semantic embeddings in zero-shot learning. However, their annotation process is labor-intensive and needs expert supervision. Current unsupervised semantic embeddings, i.e., word embeddings,…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-29 · Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata

Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic…

Computation and Language · Computer Science · 2018-11-15 · Steven Derby, Paul Miller, Brian Murphy, Barry Devereux

This study evaluates the effectiveness of Vision Language Models (VLMs) in representing and utilizing multimodal content for fact-checking. To be more specific, we investigate whether incorporating multimodal content improves performance…

Computation and Language · Computer Science · 2024-12-09 · Recep Firat Cekinel, Pinar Karagoz, Cagri Coltekin

We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality. Leveraging the textual nature of object detection labels and the hypothetical expressiveness of extracted visual…

Computer Vision and Pattern Recognition · Computer Science · 2020-10-21 · Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura

Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many…

Computation and Language · Computer Science · 2019-09-11 · Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

Embeddings play an important role in end-to-end solutions for multi-modal language processing problems. Although there has been some effort to understand the properties of single-modality embedding spaces, particularly that of text, their…

Computation and Language · Computer Science · 2023-01-20 · Muhammad Huzaifah, Ivan Kukanov

Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many…

Computation and Language · Computer Science · 2023-11-01 · Hassan Shahmohammadi, Maria Heitmeier, Elnaz Shafaei-Bajestan, Hendrik P. A. Lensch, Harald Baayen