Related papers: Fusing Vector Space Models for Domain-Specific App…
Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…
Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…
Generic word embeddings are trained on large-scale generic corpora; Domain Specific (DS) word embeddings are trained only on data from a domain of interest. This paper proposes a method to combine the breadth of generic embeddings with the…
This work presents a new and simple approach for fine-tuning pretrained word embeddings for text classification tasks. In this approach, the class in which a term appears, acts as an additional contextual variable during the fine tuning…
Sentence embeddings enable us to capture the semantic similarity of short texts. Most sentence embedding models are trained for general semantic textual similarity tasks. Therefore, to use sentence embeddings in a particular domain, the…
The meaning of a word often varies depending on its usage in different domains. The standard word embedding models struggle to represent this variation, as they learn a single global representation for a word. We propose a method to learn…
This paper proposes a way to improve the performance of existing algorithms for text classification in domains with strong language semantics. We propose a domain adaptation layer learns weights to combine a generic and a domain specific…
As a fundamental task in natural language processing, word embedding converts each word into a representation in a vector space. A challenge with word embedding is that as the vocabulary grows, the vector space's dimension increases, which…
Word embeddings have been widely used in sentiment classification because of their efficacy for semantic representations of words. Given reviews from different domains, some existing methods for word embeddings exploit sentiment…
Conventional word embeddings represent words with fixed vectors, which are usually trained based on co-occurrence patterns among words. In doing so, however, the power of such representations is limited, where the same word might be…
There are two main approaches to the distributed representation of words: low-dimensional deep learning embeddings and high-dimensional distributional models, in which each dimension corresponds to a context word. In this paper, we combine…
Semantic similarity measures are an important part in Natural Language Processing tasks. However Semantic similarity measures built for general use do not perform well within specific domains. Therefore in this study we introduce a domain…
Word embedding is a Natural Language Processing (NLP) technique that automatically maps words from a vocabulary to vectors of real numbers in an embedding space. It has been widely used in recent years to boost the performance of a vari-ety…
We investigate the problem of inducing word embeddings that are tailored for a particular bilexical relation. Our learning algorithm takes an existing lexical vector space and compresses it such that the resulting word embeddings are good…
Complementary to finding good general word embeddings, an important question for representation learning is to find dynamic word embeddings, e.g., across time or domain. Current methods do not offer a way to use or predict information on…
Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However,…
Word embeddings are traditionally trained on a large corpus in an unsupervised setting, with no specific design for incorporating domain knowledge. This can lead to unsatisfactory performances when training data originate from heterogeneous…
This paper have two parts. In the first part we discuss word embeddings. We discuss the need for them, some of the methods to create them, and some of their interesting properties. We also compare them to image embeddings and see how word…
Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…
Word embeddings have gained significant attention as learnable representations of semantic relations between words, and have been shown to improve upon the results of traditional word representations. However, little effort has been devoted…