English
Related papers

Related papers: Skip-Thought Vectors

200 papers

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan , Yunchen Pu , Ricardo Henao , Chunyuan Li , Xiaodong He , Lawrence Carin

The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics. In this paper, we propose a suite of techniques to trim and improve it. First, we validate a hypothesis that,…

Computation and Language · Computer Science 2017-06-13 Shuai Tang , Hailin Jin , Chen Fang , Zhaowen Wang , Virginia R. de Sa

There is a lot of research interest in encoding variable length sentences into fixed length vectors, in a way that preserves the sentence meanings. Two common methods include representations based on averaging word vectors, and…

Computation and Language · Computer Science 2017-02-10 Yossi Adi , Einat Kermany , Yonatan Belinkov , Ofer Lavi , Yoav Goldberg

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such…

Computation and Language · Computer Science 2020-11-20 John Wieting , Graham Neubig , Taylor Berg-Kirkpatrick

The encoder-decoder models for unsupervised sentence representation learning tend to discard the decoder after being trained on a large unlabelled corpus, since only the encoder is needed to map the input sentence into a vector…

Neural and Evolutionary Computing · Computer Science 2019-06-03 Shuai Tang , Virginia R. de Sa

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general…

Computation and Language · Computer Science 2018-04-03 Sandeep Subramanian , Adam Trischler , Yoshua Bengio , Christopher J Pal

Vector representation of sentences is important for many text processing tasks that involve clustering, classifying, or ranking sentences. Recently, distributed representation of sentences learned by neural models from unlabeled data has…

Computation and Language · Computer Science 2016-10-27 Tanay Kumar Saha , Shafiq Joty , Naeemul Hassan , Mohammad Al Hasan

Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to each…

Computation and Language · Computer Science 2017-06-20 Piotr Bojanowski , Edouard Grave , Armand Joulin , Tomas Mikolov

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the…

Semantic Similarity between two sentences can be defined as a way to determine how related or unrelated two sentences are. The task of Semantic Similarity in terms of distributed representations can be thought to be generating sentence…

Computation and Language · Computer Science 2017-10-24 Richa Sharma , Muktabh Mayank Srivastava

Computing universal distributed representations of sentences is a fundamental task in natural language processing. We propose ConsSent, a simple yet surprisingly powerful unsupervised method to learn such representations by enforcing…

Computation and Language · Computer Science 2019-01-25 Siddhartha Brahma

In this paper, we introduce a variation of the skip-gram model which jointly learns distributed word vector representations and their way of composing to form phrase embeddings. In particular, we propose a learning procedure that…

Computation and Language · Computer Science 2016-07-22 Xiaochang Peng , Daniel Gildea

This paper introduces a sentence to vector encoding framework suitable for advanced natural language processing. Our latent representation is shown to encode sentences with common semantic information with similar vector representations.…

Computation and Language · Computer Science 2018-09-30 Chi Zhang , Shagan Sah , Thang Nguyen , Dheeraj Peri , Alexander Loui , Carl Salvaggio , Raymond Ptucha

Distributed representations of sentences have become ubiquitous in natural language processing tasks. In this paper, we consider a continual learning scenario for sentence representations: Given a sequence of corpora, we aim to optimize the…

Machine Learning · Computer Science 2019-04-22 Tianlin Liu , Lyle Ungar , João Sedoc

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued,…

Computation and Language · Computer Science 2019-06-21 Dinghan Shen , Pengyu Cheng , Dhanasekar Sundararaman , Xinyuan Zhang , Qian Yang , Meng Tang , Asli Celikyilmaz , Lawrence Carin

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so…

Computation and Language · Computer Science 2018-07-10 Alexis Conneau , Douwe Kiela , Holger Schwenk , Loic Barrault , Antoine Bordes

Unsupervised text embeddings extraction is crucial for text understanding in machine learning. Word2Vec and its variants have received substantial success in mapping words with similar syntactic or semantic meaning to vectors close to each…

Computation and Language · Computer Science 2018-05-30 Furong Huang , Animashree Anandkumar

In this paper, we propose a novel deep neural network architecture, Sequence-to-Sequence Audio2Vec, for unsupervised learning of fixed-length vector representations of audio segments excised from a speech corpus, where the vectors contain…

Computation and Language · Computer Science 2017-11-07 Yu-An Chung , James Glass

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We…

Computation and Language · Computer Science 2018-12-31 Matteo Pagliardini , Prakhar Gupta , Martin Jaggi

The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we…

Computation and Language · Computer Science 2013-10-18 Tomas Mikolov , Ilya Sutskever , Kai Chen , Greg Corrado , Jeffrey Dean
‹ Prev 1 2 3 10 Next ›