Related papers: Deconstructing word embedding algorithms

Deconstructing and reconstructing word embedding algorithms

Uncontextualized word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Given the historical success of word embeddings in NLP, we propose a retrospective on some of…

Computation and Language · Computer Science 2019-12-02 Edward Newell, Kian Kenyon-Dean, Jackie Chi Kit Cheung

Word Embeddings Are Capable of Capturing Rhythmic Similarity of Words

Word embedding systems such as Word2Vec and GloVe are well-known in deep learning approaches to NLP. This is largely due to their ability to capture semantic relationships between words. In this work we investigated their usefulness in…

Computation and Language · Computer Science 2022-04-15 Hosein Rezaei

What the Vec? Towards Probabilistically Grounded Embeddings

Word2Vec (W2V) and GloVe are popular, fast and efficient word embedding algorithms. Their embeddings are widely used and perform well on a variety of natural language processing tasks. Moreover, W2V has recently been adopted in the field of…

Computation and Language · Computer Science 2019-11-12 Carl Allen, Ivana Balažević, Timothy Hospedales

Word Embedding Algorithms as Generalized Low Rank Models and their Canonical Form

Word embedding algorithms produce very reliable feature representations of words that are used by neural network models across a constantly growing multitude of NLP tasks. As such, it is imperative for NLP practitioners to understand how…

Computation and Language · Computer Science 2019-11-11 Kian Kenyon-Dean

Neural-based Noise Filtering from Word Embeddings

Word embeddings have been demonstrated to benefit NLP tasks impressively. Yet, there is room for improvement in the vector representations, because current word embeddings typically contain unnecessary information, i.e., noise. We propose…

Computation and Language · Computer Science 2016-10-07 Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement

Deep learning natural language processing models often use vector word embeddings, such as word2vec or GloVe, to represent words. A discrete sequence of words can be much more easily integrated with downstream neural layers if it is…

Machine Learning · Computer Science 2020-03-04 Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Affect Enriched Word Embeddings for News Information Retrieval

Distributed representations of words have shown to be useful to improve the effectiveness of IR systems in many sub-tasks like query expansion, retrieval and ranking. Algorithms like word2vec, GloVe and others are also key factors in many…

Information Retrieval · Computer Science 2019-09-05 Tommaso Teofili, Niyati Chhaya

Tsetlin Machine Embedding: Representing Words Using Logical Expressions

Embedding words in vector space is a fundamental first step in state-of-the-art natural language processing (NLP). Typical NLP solutions employ pre-defined vector representations to improve generalization by co-locating similar words in…

Computation and Language · Computer Science 2023-01-03 Bimal Bhattarai, Ole-Christoffer Granmo, Lei Jiao, Rohan Yadav, Jivitesh Sharma

Emotional Embeddings: Refining Word Embeddings to Capture Emotional Content of Words

Word embeddings are one of the most useful tools in any modern natural language processing expert's toolkit. They contain various types of information about each word which makes them the best way to represent the terms in any NLP task. But…

Computation and Language · Computer Science 2019-06-20 Armin Seyeditabari, Narges Tabari, Shafie Gholizade, Wlodek Zadrozny

WOVe: Incorporating Word Order in GloVe Word Embeddings

Word vector representations open up new opportunities to extract useful information from unstructured text. Defining a word as a vector made it easy for the machine learning algorithms to understand a text and extract information from. Word…

Computation and Language · Computer Science 2021-05-19 Mohammed Ibrahim, Susan Gauch, Tyler Gerth, Brandon Cox

Hash2Vec, Feature Hashing for Word Embeddings

In this paper we propose the application of feature hashing to create word embeddings for natural language processing. Feature hashing has been used successfully to create document vectors in related tasks like document classification. In…

Computation and Language · Computer Science 2017-04-18 Luis Argerich, Joaquín Torré Zaffaroni, Matías J Cano

Word Embeddings: Stability and Semantic Change

Word embeddings are computed by a class of techniques within natural language processing (NLP), that create continuous vector representations of words in a language from a large text corpus. The stochastic nature of the training process of…

Computation and Language · Computer Science 2020-08-03 Lucas Rettenmeier

Towards a Theoretical Understanding of Word and Relation Representation

Representing words by vectors, or embeddings, enables computational reasoning and is foundational to automating natural language tasks. For example, if word embeddings of similar words contain similar values, word similarity can be readily…

Computation and Language · Computer Science 2022-02-02 Carl Allen

Word Embeddings for Banking Industry

Applications of Natural Language Processing (NLP) are plentiful, from sentiment analysis to text classification. Practitioners rely on static word embeddings (e.g. Word2Vec or GloVe) or static word representation from contextual models…

Computation and Language · Computer Science 2023-06-06 Avnish Patel

SPINE: SParse Interpretable Neural Embeddings

Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity…

Computation and Language · Computer Science 2017-11-27 Anant Subramanian, Danish Pruthi, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Eduard Hovy

An Ensemble Method to Produce High-Quality Word Embeddings (2016)

A currently successful approach to computational semantics is to represent words as embeddings in a machine-learned vector space. We present an ensemble method that combines embeddings produced by GloVe (Pennington et al., 2014) and…

Computation and Language · Computer Science 2019-12-20 Robyn Speer, Joshua Chin

Revisiting Word Embeddings in the LLM Era

Large Language Models (LLMs) have recently shown remarkable advancement in various NLP tasks. As such, a popular trend has emerged lately where NLP researchers extract word/sentence/document embeddings from these large decoder-only models…

Computation and Language · Computer Science 2025-03-04 Yash Mahajan, Matthew Freestone, Sathyanarayanan Aakur, Santu Karmaker

From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions For Large Language Models

Word embeddings and language models have transformed natural language processing (NLP) by facilitating the representation of linguistic elements in continuous vector spaces. This review visits foundational concepts such as the…

Computation and Language · Computer Science 2025-12-03 Charles Zhang, Benji Peng, Xintian Sun, Qian Niu, Junyu Liu, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Ming Liu, Yichao Zhang, Xinyuan Song, Cheng Fei, Caitlyn Heqi Yin, Lawrence KQ Yan, Hongyang He, Tianyang Wang

Intrinsic analysis for dual word embedding space models

Recent word embeddings techniques represent words in a continuous vector space, moving away from the atomic and sparse representations of the past. Each such technique can further create multiple varieties of embeddings based on different…

Computation and Language · Computer Science 2020-12-08 Mohit Mayank