Related papers: Multilingual Word Embeddings using Multigraphs

Unsupervised Cross-lingual Word Embedding by Multilingual Neural Language Models

We propose an unsupervised method to obtain cross-lingual embeddings without any parallel data or pre-trained word embeddings. The proposed model, which we call multilingual neural language models, takes sentences of multiple languages as…

Computation and Language · Computer Science 2018-09-10 Takashi Wada , Tomoharu Iwata

Embedding Word Similarity with Neural Machine Translation

Neural language models learn word representations, or embeddings, that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models, a recently-developed class of neural…

Computation and Language · Computer Science 2015-04-06 Felix Hill , Kyunghyun Cho , Sebastien Jean , Coline Devin , Yoshua Bengio

Multilingual Models for Compositional Distributed Semantics

We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of…

Computation and Language · Computer Science 2014-04-21 Karl Moritz Hermann , Phil Blunsom

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by…

Computation and Language · Computer Science 2017-06-27 Shyam Upadhyay , Kai-Wei Chang , Matt Taddy , Adam Kalai , James Zou

Learning Multilingual Word Embeddings Using Image-Text Data

There has been significant interest recently in learning multilingual word embeddings -- in which semantically similar words across languages have similar embeddings. State-of-the-art approaches have relied on expensive labeled data, which…

Computation and Language · Computer Science 2020-07-02 Karan Singhal , Karthik Raman , Balder ten Cate

Learning to Represent Bilingual Dictionaries

Bilingual word embeddings have been widely used to capture the similarity of lexical semantics in different human languages. However, many applications, such as cross-lingual semantic search and question answering, can be largely benefited…

Computation and Language · Computer Science 2019-09-10 Muhao Chen , Yingtao Tian , Haochen Chen , Kai-Wei Chang , Steven Skiena , Carlo Zaniolo

Learning Meta-Embeddings by Using Ensembles of Embedding Sets

Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…

Computation and Language · Computer Science 2015-12-31 Wenpeng Yin , Hinrich Schütze

Multiple Word Embeddings for Increased Diversity of Representation

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand.…

Computation and Language · Computer Science 2020-10-13 Brian Lester , Daniel Pressel , Amy Hemmeter , Sagnik Ray Choudhury , Srinivas Bangalore

Multilingual Distributed Representations without Word Alignment

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…

Computation and Language · Computer Science 2014-03-21 Karl Moritz Hermann , Phil Blunsom

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We…

Computation and Language · Computer Science 2018-12-31 Matteo Pagliardini , Prakhar Gupta , Martin Jaggi

Beyond Shared Vocabulary: Increasing Representational Word Similarities across Languages for Multilingual Machine Translation

Using a vocabulary that is shared across languages is common practice in Multilingual Neural Machine Translation (MNMT). In addition to its simple design, shared tokens play an important role in positive knowledge transfer, assuming that…

Computation and Language · Computer Science 2024-01-23 Di Wu , Christof Monz

Multilingual acoustic word embedding models for processing zero-resource languages

Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. In settings where unlabelled speech is the only available resource, such embeddings can be used in "zero-resource" speech search, indexing…

Computation and Language · Computer Science 2020-02-24 Herman Kamper , Yevgen Matusevych , Sharon Goldwater

Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining

Existing models of multilingual sentence embeddings require large parallel data resources which are not available for low-resource languages. We propose a novel unsupervised method to derive multilingual sentence embeddings relying only on…

Computation and Language · Computer Science 2021-05-24 Ivana Kvapilíková , Mikel Artetxe , Gorka Labaka , Eneko Agirre , Ondřej Bojar

A Survey Of Cross-lingual Word Embedding Models

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In…

Computation and Language · Computer Science 2019-10-08 Sebastian Ruder , Ivan Vulić , Anders Søgaard

Syntax Representation in Word Embeddings and Neural Networks -- A Survey

Neural networks trained on natural language processing tasks capture syntax even though it is not provided as a supervision signal. This indicates that syntactic analysis is essential to the understanding of language in artificial…

Computation and Language · Computer Science 2020-10-05 Tomasz Limisiewicz , David Mareček

Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech

In this paper, we explore the learning of neural network embeddings for natural images and speech waveforms describing the content of those images. These embeddings are learned directly from the waveforms without the use of linguistic…

Computation and Language · Computer Science 2018-04-10 David Harwath , Galen Chuang , James Glass

Polyglot: Distributed Word Representations for Multilingual NLP

Distributed word representations (word embeddings) have recently contributed to competitive performance in language modeling and several NLP tasks. In this work, we train word embeddings for more than 100 languages using their corresponding…

Computation and Language · Computer Science 2014-06-30 Rami Al-Rfou , Bryan Perozzi , Steven Skiena

Machine Translation with Cross-lingual Word Embeddings

Learning word embeddings using distributional information is a task that has been studied by many researchers, and many studies are reported in the literature. By contrast, fewer studies have been done for the case of multiple languages.…

Computation and Language · Computer Science 2020-04-15 Marco Berlot , Evan Kaplan

Learning Crosslingual Word Embeddings without Bilingual Corpora

Crosslingual word embeddings represent lexical items from different languages in the same vector space, enabling transfer of NLP tools. However, previous attempts had expensive resource requirements, difficulty incorporating monolingual…

Computation and Language · Computer Science 2016-07-01 Long Duong , Hiroshi Kanayama , Tengfei Ma , Steven Bird , Trevor Cohn

On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meaning are represented by similar vectors, regardless of the language. Recent developments which construct these embeddings…

Computation and Language · Computer Science 2020-03-04 Yerai Doval , Jose Camacho-Collados , Luis Espinosa-Anke , Steven Schockaert