Related papers: Angular-Based Word Meta-Embedding Learning

Learning Meta-Embeddings by Using Ensembles of Embedding Sets

Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…

Computation and Language · Computer Science 2015-12-31 Wenpeng Yin, Hinrich Schütze

Meta-Embedding as Auxiliary Task Regularization

Word embeddings have been shown to benefit from ensembling several word embedding sources, often carried out using straightforward mathematical operations over the set of word vectors. More recently, self-supervised learning has been used…

Computation and Language · Computer Science 2020-01-27 James O'Neill, Danushka Bollegala

Think Globally, Embed Locally --- Locally Linear Meta-embedding of Words

Distributed word embeddings have shown superior performances in numerous Natural Language Processing (NLP) tasks. However, their performances vary significantly across different tasks, implying that the word embeddings learnt by those…

Computation and Language · Computer Science 2017-09-21 Danushka Bollegala, Kohei Hayashi, Ken-ichi Kawarabayashi

Learning Meta Word Embeddings by Unsupervised Weighted Concatenation of Source Embeddings

Given multiple source word embeddings learnt using diverse algorithms and lexical resources, meta word embedding learning methods attempt to learn more accurate and wide-coverage word embeddings. Prior work on meta-embedding has repeatedly…

Computation and Language · Computer Science 2022-04-27 Danushka Bollegala

Frustratingly Easy Meta-Embedding -- Computing Meta-Embeddings by Averaging Source Word Embeddings

Creating accurate meta-embeddings from pre-trained source embeddings has received attention lately. Methods based on global and locally-linear transformation and concatenation have been shown to produce accurate meta-embeddings. In this paper,…

Computation and Language · Computer Science 2018-04-17 Joshua Coates, Danushka Bollegala

An Empirical Study on Post-processing Methods for Word Embeddings

Word embeddings learnt from large corpora have been adopted in various applications in natural language processing and served as the general input representations to learning systems. Recently, a series of post-processing methods have been…

Machine Learning · Computer Science 2019-10-25 Shuai Tang, Mahta Mousavi, Virginia R. de Sa

Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest…

Computation and Language · Computer Science 2020-11-06 Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung

Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models

A variety of contextualised language models have been proposed in the NLP community, which are trained on diverse corpora to produce numerous Neural Language Models (NLMs). However, different NLMs have reported different levels of…

Computation and Language · Computer Science 2022-04-19 Keigo Takahashi, Danushka Bollegala

A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings

This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings. Our method integrates multiple word embeddings created from complementary techniques, textual sources, knowledge bases and languages. Existing…

Computation and Language · Computer Science 2021-09-09 Iker García-Ferrero, Rodrigo Agerri, German Rigau

Reconstruction of Word Embeddings from Sub-Word Parameters

Pre-trained word embeddings improve the performance of a neural model at the cost of increasing the model size. We propose to benefit from this resource without paying the cost by operating strictly at the sub-lexical level. Our approach is…

Computation and Language · Computer Science 2017-07-24 Karl Stratos

Refinement of Unsupervised Cross-Lingual Word Embeddings

Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages by allowing to learn multilingual word representations even without using any direct bilingual signal. The lion's share of the methods are…

Computation and Language · Computer Science 2020-09-03 Magdalena Biesialska, Marta R. Costa-jussà

Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings

Sense embedding learning methods learn multiple vectors for a given ambiguous word, corresponding to its different word senses. For this purpose, different methods have been proposed in prior work on sense embedding learning that use…

Computation and Language · Computer Science 2023-05-31 Haochen Luo, Yi Zhou, Danushka Bollegala

A Survey on Word Meta-Embedding Learning

Meta-embedding (ME) learning is an emerging approach that attempts to learn more accurate word embeddings given existing (source) word embeddings as the sole input. Due to their ability to incorporate semantics from multiple source…

Computation and Language · Computer Science 2022-04-26 Danushka Bollegala, James O'Neill

Embedding Word Similarity with Neural Machine Translation

Neural language models learn word representations, or embeddings, that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models, a recently-developed class of neural…

Computation and Language · Computer Science 2015-04-06 Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio

Learning to Remove: Towards Isotropic Pre-trained BERT Embedding

Pre-trained language models such as BERT have become a more common choice for natural language processing (NLP) tasks. Research in word representation shows that isotropic embeddings can significantly improve performance on downstream tasks.…

Computation and Language · Computer Science 2021-08-30 Yuxin Liang, Rui Cao, Jie Zheng, Jie Ren, Ling Gao

Asynchronous Training of Word Embeddings for Large Text Corpora

Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is…

Machine Learning · Computer Science 2018-12-11 Avishek Anand, Megha Khosla, Jaspreet Singh, Jan-Hendrik Zab, Zijian Zhang

Meta-Embeddings for Natural Language Inference and Semantic Similarity tasks

Word Representations form the core component for almost all advanced Natural Language Processing (NLP) applications such as text mining, question-answering, and text summarization, etc. Over the last two decades, immense research is…

Computation and Language · Computer Science 2020-12-02 Shree Charran R, Rahul Kumar Dubey

Dynamic Meta-Embeddings for Improved Sentence Representations

While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic…

Computation and Language · Computer Science 2018-09-06 Douwe Kiela, Changhan Wang, Kyunghyun Cho

Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy

Cross-lingual word embeddings aim to capture common linguistic regularities of different languages, which benefit various downstream tasks ranging from machine translation to transfer learning. Recently, it has been shown that these…

Computation and Language · Computer Science 2018-11-02 Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dongdong Zhang, Xu Sun

Cross-Lingual Word Embedding Refinement by $\ell_{1}$ Norm Optimisation

Cross-Lingual Word Embeddings (CLWEs) encode words from two or more languages in a shared high-dimensional space in which vectors representing words with similar meaning (regardless of language) are closely located. Existing methods for…

Computation and Language · Computer Science 2022-01-25 Xutan Peng, Chenghua Lin, Mark Stevenson