Related papers: Semantic Vector Machines

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…

Computation and Language · Computer Science 2017-06-22 Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Semantic Sentence Embeddings for Paraphrasing and Text Summarization

This paper introduces a sentence to vector encoding framework suitable for advanced natural language processing. Our latent representation is shown to encode sentences with common semantic information with similar vector representations.…

Computation and Language · Computer Science 2018-09-30 Chi Zhang, Shagan Sah, Thang Nguyen, Dheeraj Peri, Alexander Loui, Carl Salvaggio, Raymond Ptucha

Neural Embeddings for Text

We propose a new kind of embedding for natural language text that deeply represents semantic meaning. Standard text embeddings use the outputs from hidden layers of a pretrained language model. In our method, we let a language model learn…

Computation and Language · Computer Science 2022-11-22 Oleg Vasilyev, John Bohannon

Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment

Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years. For a wide range of applications, vertices in a network are typically…

Computation and Language · Computer Science 2018-08-30 Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Word Embeddings and Their Use In Sentence Classification Tasks

This paper has two parts. In the first part we discuss word embeddings. We discuss the need for them, some of the methods to create them, and some of their interesting properties. We also compare them to image embeddings and see how word…

Machine Learning · Computer Science 2016-10-27 Amit Mandelbaum, Adi Shalev

Tensorized Embedding Layers for Efficient Model Compression

The embedding layers transforming input words into real vectors are the key components of deep neural networks used in natural language processing. However, when the vocabulary is large, the corresponding weight matrices can be enormous,…

Computation and Language · Computer Science 2020-02-20 Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena Orlova, Ivan Oseledets

Charagram: Embedding Words and Sentences via Character n-grams

We present Charagram embeddings, a simple approach for learning character-based compositional models to embed textual sequences. A word or sentence is represented using a character n-gram count vector, followed by a single nonlinear…

Computation and Language · Computer Science 2016-07-12 John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu

Embedding Word Similarity with Neural Machine Translation

Neural language models learn word representations, or embeddings, that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models, a recently-developed class of neural…

Computation and Language · Computer Science 2015-04-06 Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks. However, word embeddings are represented by dense, real-valued vectors and they are therefore…

Computation and Language · Computer Science 2019-12-24 Andreas Hanselowski, Iryna Gurevych

Learning Compressed Sentence Representations for On-Device Text Processing

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued,…

Computation and Language · Computer Science 2019-06-21 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Context-Dependent Word Representation for Neural Machine Translation

We first observe a potential weakness of continuous vector representations of symbols in neural machine translation. That is, the continuous vector representation, or a word embedding vector, of a symbol encodes multiple dimensions of…

Computation and Language · Computer Science 2016-07-05 Heeyoul Choi, Kyunghyun Cho, Yoshua Bengio

A Bilingual Generative Transformer for Semantic Sentence Embedding

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such…

Computation and Language · Computer Science 2020-11-20 John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

A Comparative Study on Structural and Semantic Properties of Sentence Embeddings

Sentence embeddings encode natural language sentences as low-dimensional dense vectors. A great deal of effort has been put into using sentence embeddings to improve several important natural language processing tasks. Relation extraction…

Computation and Language · Computer Science 2020-09-24 Alexander Kalinowski, Yuan An

Learning Joint Multilingual Sentence Representations with Neural Machine Translation

In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages. Our aim is that a representation which is independent of the language, is likely to capture the…

Computation and Language · Computer Science 2017-08-09 Holger Schwenk, Matthijs Douze

Sentence transition matrix: An efficient approach that preserves sentence semantics

Sentence embedding is a significant research topic in the field of natural language processing (NLP). Generating sentence embedding vectors reflecting the intrinsic meaning of a sentence is a key factor to achieve an enhanced performance in…

Computation and Language · Computer Science 2019-01-17 Myeongjun Jang, Pilsung Kang

Neural Machine Translation by Jointly Learning to Align and Translate

Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to…

Computation and Language · Computer Science 2016-05-23 Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words,…

Computation and Language · Computer Science 2016-07-25 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

Character n-gram Embeddings to Improve RNN Language Models

This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information. We focus on character n-grams based on research in the field of word embedding construction (Wieting et al. 2016). Our…

Computation and Language · Computer Science 2019-06-14 Sho Takase, Jun Suzuki, Masaaki Nagata

Text Simplification with Sentence Embeddings

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels.…

Computation and Language · Computer Science 2025-10-29 Matthew Shardlow

Text segmentation with character-level text embeddings

Learning word representations has recently seen much success in computational linguistics. However, assuming sequences of word tokens as input to linguistic analysis is often unjustified. For many languages word segmentation is a…

Computation and Language · Computer Science 2013-09-19 Grzegorz Chrupała