Related papers: Compressing Word Embeddings

Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection

As a fundamental task in natural language processing, word embedding converts each word into a representation in a vector space. A challenge with word embedding is that as the vocabulary grows, the vector space's dimension increases, which…

Computation and Language · Computer Science 2024-11-05 Jintang Xue, Yun-Cheng Wang, Chengwei Wei, C.-C. Jay Kuo

A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches

Vector-based word representations help countless Natural Language Processing (NLP) tasks capture the language's semantic and syntactic regularities. In this paper, we present the characteristics of existing word embedding approaches and…

Computation and Language · Computer Science 2024-03-05 Obaidullah Zaland, Muhammad Abulaish, Mohd. Fazil

Comparative Analysis of Word Embeddings for Capturing Word Similarities

Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks. Most of the natural language processing models that are based on deep learning…

Computation and Language · Computer Science 2020-05-11 Martina Toshevska, Frosina Stojanovska, Jovan Kalajdjieski

Near-lossless Binarization of Word Embeddings

Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performances. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of…

Computation and Language · Computer Science 2020-01-23 Julien Tissier, Christophe Gravier, Amaury Habrard

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks. However, word embeddings are represented by dense, real-valued vectors and they are therefore…

Computation and Language · Computer Science 2019-12-24 Andreas Hanselowski, Iryna Gurevych

Imparting Interpretability to Word Embeddings while Preserving Semantic Structure

As a ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation. They capture semantic and syntactic relations among words but the…

Computation and Language · Computer Science 2020-07-03 Lutfi Kerem Senel, Ihsan Utlu, Furkan Şahinuç, Haldun M. Ozaktas, Aykut Koç

Learning Compressed Sentence Representations for On-Device Text Processing

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued,…

Computation and Language · Computer Science 2019-06-21 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Compressing and Interpreting Word Embeddings with Latent Space Regularization and Interactive Semantics Probing

Word embedding, a high-dimensional (HD) numerical representation of words generated by machine learning models, has been used for different natural language processing tasks, e.g., translation between two languages. Recently, there has been…

Human-Computer Interaction · Computer Science 2024-03-26 Haoyu Li, Junpeng Wang, Yan Zheng, Liang Wang, Wei Zhang, Han-Wei Shen

Word Embedding Visualization Via Dictionary Learning

Co-occurrence statistics based word embedding techniques have proved to be very useful in extracting the semantic and syntactic representation of words as low dimensional continuous vectors. In this work, we discovered that dictionary…

Computation and Language · Computer Science 2021-03-16 Juexiao Zhang, Yubei Chen, Brian Cheung, Bruno A Olshausen

Efficient Estimation of Word Representations in Vector Space

We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the…

Computation and Language · Computer Science 2013-09-10 Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean

An Empirical Study on Post-processing Methods for Word Embeddings

Word embeddings learnt from large corpora have been adopted in various applications in natural language processing and served as the general input representations to learning systems. Recently, a series of post-processing methods have been…

Machine Learning · Computer Science 2019-10-25 Shuai Tang, Mahta Mousavi, Virginia R. de Sa

Sentence Analogies: Exploring Linguistic Relationships and Regularities in Sentence Embeddings

While important properties of word vector representations have been studied extensively, far less is known about the properties of sentence vector representations. Word vectors are often evaluated by assessing to what degree they exhibit…

Computation and Language · Computer Science 2020-03-10 Xunjie Zhu, Gerard de Melo

Embedding Grammars

Classic grammars and regular expressions can be used for a variety of purposes, including parsing, intent detection, and matching. However, the comparisons are performed at a structural level, with constituent elements (words or characters)…

Computation and Language · Computer Science 2018-08-16 David Wingate, William Myers, Nancy Fulda, Tyler Etchart

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words,…

Computation and Language · Computer Science 2016-07-25 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

A Comparative Study of Word Embeddings for Reading Comprehension

The focus of past machine learning research for Reading Comprehension tasks has been primarily on the design of novel deep learning architectures. Here we show that seemingly minor choices made on (1) the use of pre-trained word embeddings,…

Computation and Language · Computer Science 2017-03-06 Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, William W. Cohen

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…

Computation and Language · Computer Science 2017-06-22 Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

An Investigation on Word Embedding Offset Clustering as Relationship Classification

Vector representations obtained from word embedding are the source of many groundbreaking advances in natural language processing. They yield word representations that are capable of capturing semantics and analogies of words within a text…

Computation and Language · Computer Science 2023-05-09 Didier Gohourou, Kazuhiro Kuwabara

WordRank: Learning Word Embeddings via Robust Ranking

Embedding words in a vector space has gained a lot of attention in recent years. While state-of-the-art methods provide efficient computation of word similarities via a low-dimensional matrix embedding, their motivation is often left…

Computation and Language · Computer Science 2016-09-29 Shihao Ji, Hyokun Yun, Pinar Yanardag, Shin Matsushima, S. V. N. Vishwanathan

word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement

Deep learning natural language processing models often use vector word embeddings, such as word2vec or GloVe, to represent words. A discrete sequence of words can be much more easily integrated with downstream neural layers if it is…

Machine Learning · Computer Science 2020-03-04 Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…

Computation and Language · Computer Science 2018-07-11 Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs