Related papers: Bilingual Distributed Word Representations from Do…

Unsupervised Multilingual Word Embeddings

Multilingual Word Embeddings (MWEs) represent words from multiple languages in a single distributional vector space. Unsupervised MWE (UMWE) methods acquire multilingual embeddings without cross-lingual supervision, which is a significant…

Computation and Language · Computer Science 2018-09-07 Xilun Chen , Claire Cardie

Multilingual Models for Compositional Distributed Semantics

We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of…

Computation and Language · Computer Science 2014-04-21 Karl Moritz Hermann , Phil Blunsom

CLUSE: Cross-Lingual Unsupervised Sense Embeddings

This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space, where the cross-lingual signal in the English-Chinese parallel corpus is…

Computation and Language · Computer Science 2018-10-23 Ta-Chung Chi , Yun-Nung Chen

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by…

Computation and Language · Computer Science 2017-06-27 Shyam Upadhyay , Kai-Wei Chang , Matt Taddy , Adam Kalai , James Zou

Robust Cross-lingual Embeddings from Parallel Sentences

Recent advances in cross-lingual word embeddings have primarily relied on mapping-based methods, which project pretrained word embeddings from different languages into a shared space through a linear transformation. However, these…

Computation and Language · Computer Science 2020-05-04 Ali Sabet , Prakhar Gupta , Jean-Baptiste Cordonnier , Robert West , Martin Jaggi

Attention Word Embedding

Word embedding models learn semantically rich vector representations of words and are widely used to initialize natural processing language (NLP) models. The popular continuous bag-of-words (CBOW) model of word2vec learns a vector embedding…

Computation and Language · Computer Science 2020-06-02 Shashank Sonkar , Andrew E. Waters , Richard G. Baraniuk

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments

We introduce BilBOWA (Bilingual Bag-of-Words without Alignments), a simple and computationally-efficient model for learning bilingual distributed representations of words which can scale to large monolingual datasets and does not require…

Machine Learning · Statistics 2016-02-05 Stephan Gouws , Yoshua Bengio , Greg Corrado

Leveraging multilingual transfer for unsupervised semantic acoustic word embeddings

Acoustic word embeddings (AWEs) are fixed-dimensional vector representations of speech segments that encode phonetic content so that different realisations of the same word have similar embeddings. In this paper we explore semantic AWE…

Audio and Speech Processing · Electrical Eng. & Systems 2023-07-06 Christiaan Jacobs , Herman Kamper

Learning to Represent Bilingual Dictionaries

Bilingual word embeddings have been widely used to capture the similarity of lexical semantics in different human languages. However, many applications, such as cross-lingual semantic search and question answering, can be largely benefited…

Computation and Language · Computer Science 2019-09-10 Muhao Chen , Yingtao Tian , Haochen Chen , Kai-Wei Chang , Steven Skiena , Carlo Zaniolo

Category Enhanced Word Embedding

Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for those words…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou , Chonglin Sun , Zhiyuan Liu , Francis C. M. Lau

Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders

We present an approach to learning multi-sense word embeddings relying both on monolingual and bilingual information. Our model consists of an encoder, which uses monolingual and bilingual context (i.e. a parallel sentence) to choose a…

Computation and Language · Computer Science 2016-03-31 Simon Šuster , Ivan Titov , Gertjan van Noord

Neural approaches to spoken content embedding

Comparing spoken segments is a central operation to speech processing. Traditional approaches in this area have favored frame-level dynamic programming algorithms, such as dynamic time warping, because they require no supervision, but they…

Computation and Language · Computer Science 2023-08-30 Shane Settle

Learning Bilingual Word Embeddings Using Lexical Definitions

Bilingual word embeddings, which representlexicons of different languages in a shared em-bedding space, are essential for supporting se-mantic and knowledge transfers in a variety ofcross-lingual NLP tasks. Existing approachesto training…

Computation and Language · Computer Science 2020-01-07 Weijia Shi , Muhao Chen , Yingtao Tian , Kai-Wei Chang

Bilingual Embeddings with Random Walks over Multilingual Wordnets

Bilingual word embeddings represent words of two languages in the same space, and allow to transfer knowledge from one language to the other without machine translation. The main approach is to train monolingual embeddings first and then…

Computation and Language · Computer Science 2018-04-24 J. Goikoetxea , A. Soroa , E. Agirre

Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings

Models of acoustic word embeddings (AWEs) learn to map variable-length spoken word segments onto fixed-dimensionality vector representations such that different acoustic exemplars of the same word are projected nearby in the embedding…

Computation and Language · Computer Science 2022-09-20 Badr M. Abdullah , Bernd Möbius , Dietrich Klakow

Multilingual Word Embeddings using Multigraphs

We present a family of neural-network--inspired models for computing continuous word representations, specifically designed to exploit both monolingual and multilingual text. This framework allows us to perform unsupervised training of…

Computation and Language · Computer Science 2016-12-15 Radu Soricut , Nan Ding

Bilingual Lexicon Induction through Unsupervised Machine Translation

A recent research line has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through…

Computation and Language · Computer Science 2021-12-28 Mikel Artetxe , Gorka Labaka , Eneko Agirre

Learning Word Embeddings from Intrinsic and Extrinsic Views

While word embeddings are currently predominant for natural language processing, most of existing models learn them solely from their contexts. However, these context-based word embeddings are limited since not all words' meaning can be…

Computation and Language · Computer Science 2016-08-23 Jifan Chen , Kan Chen , Xipeng Qiu , Qi Zhang , Xuanjing Huang , Zheng Zhang

hauWE: Hausa Words Embedding for Natural Language Processing

Words embedding (distributed word vector representations) have become an essential component of many natural language processing (NLP) tasks such as machine translation, sentiment analysis, word analogy, named entity recognition and word…

Computation and Language · Computer Science 2020-01-08 Idris Abdulmumin , Bashir Shehu Galadanci

Comparative Analysis of Word Embeddings for Capturing Word Similarities

Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks. Most of the natural language processing models that are based on deep learning…

Computation and Language · Computer Science 2020-05-11 Martina Toshevska , Frosina Stojanovska , Jovan Kalajdjieski