Related papers: Word Equations: Inherently Interpretable Sparse Wo…
Interpretability benefits the theoretical understanding of representations. Existing word embeddings are generally dense representations. Hence, the meaning of latent dimensions is difficult to interpret. This makes word embeddings like a…
Word embeddings have found their way into a wide range of natural language processing tasks including those in the biomedical domain. While these vector representations successfully capture semantic and syntactic word relations, hidden…
Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased…
Dense word embeddings, which encode semantic meanings of words to low dimensional vector spaces have become very popular in natural language processing (NLP) research due to their state-of-the-art performances in many NLP tasks. Word…
As an ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation. They capture semantic and syntactic relations among words but the…
Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks. However, word embeddings are represented by dense, real-valued vectors and they are therefore…
Word embeddings are a basic building block of modern NLP pipelines. Efforts have been made to learn rich, efficient, and interpretable embeddings for large generic datasets available in the public domain. However, these embeddings have…
Despite the success achieved on various natural language processing tasks, word embeddings are difficult to interpret due to the dense vector representations. This paper focuses on interpreting the embeddings for various aspects, including…
Current distributed representations of words show little resemblance to theories of lexical semantics. The former are dense and uninterpretable, the latter largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g.,…
Pre-trained word embeddings are widely used for transfer learning in natural language processing. The embeddings are continuous and distributed representations of the words that preserve their similarities in compact Euclidean spaces.…
Word embeddings have demonstrated strong performance on NLP tasks. However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within computational social science and digital humanities. We…
Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic…
Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity…
Embedding spaces contain interpretable dimensions indicating gender, formality in style, or even object properties. This has been observed multiple times. Such interpretable dimensions are becoming valuable tools in different areas of…
Text embeddings are a fundamental component in many NLP tasks, including classification, regression, clustering, and semantic search. However, despite their ubiquitous application, challenges persist in interpreting embeddings and…
Word embedding is a Natural Language Processing (NLP) technique that automatically maps words from a vocabulary to vectors of real numbers in an embedding space. It has been widely used in recent years to boost the performance of a vari-ety…
Neural network-based language models deal with data sparsity problems by mapping the large discrete space of words into a smaller continuous space of real-valued vectors. By learning distributed vector representations for words, each…
Word embeddings are ubiquitous in NLP and information retrieval, but it is unclear what they represent when the word is polysemous. Here it is shown that multiple word senses reside in linear superposition within the word embedding and…
Word embeddings are a key component of high-performing natural language processing (NLP) systems, but it remains a challenge to learn good representations for novel words on the fly, i.e., for words that did not occur in the training data.…
Neural machine translation has achieved remarkable empirical performance over standard benchmark datasets, yet recent evidence suggests that the models can still fail easily dealing with substandard inputs such as misspelled words, To…