Related papers: An efficient framework for learning sentence repre…

Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding

Context plays an important role in human language understanding, thus it may also be useful for machines learning vector representations of language. In this paper, we explore an asymmetric encoder-decoder structure for unsupervised…

Neural and Evolutionary Computing · Computer Science 2018-06-04 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Pixel Sentence Representation Learning

Pretrained language models are long known to be subpar in capturing sentence and document-level semantics. Though heavily investigated, transferring perturbation-based methods from unsupervised visual representation learning to NLP remains…

Computation and Language · Computer Science 2024-02-14 Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

Sentence Ordering and Coherence Modeling using Recurrent Neural Networks

Modeling the structure of coherent texts is a key NLP problem. The task of coherently organizing a given set of sentences has been commonly used to build and evaluate models that understand such structure. We propose an end-to-end…

Computation and Language · Computer Science 2017-12-25 Lajanugen Logeswaran, Honglak Lee, Dragomir Radev

Learning Distributed Representations of Sentences from Unlabelled Data

Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This…

Computation and Language · Computer Science 2016-02-11 Felix Hill, Kyunghyun Cho, Anna Korhonen

Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning

This work presents a novel objective function for the unsupervised training of neural network sentence encoders. It exploits signals from paragraph-level discourse coherence to train these models to understand text. Our objective is purely…

Computation and Language · Computer Science 2017-05-02 Yacine Jernite, Samuel R. Bowman, David Sontag

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

Semantic representation learning for sentences is an important and well-studied problem in NLP. The current trend for this task involves training a Transformer-based sentence encoder through a contrastive objective with text, i.e.,…

Computation and Language · Computer Science 2022-09-21 Yiren Jian, Chongyang Gao, Soroush Vosoughi

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general…

Computation and Language · Computer Science 2018-04-03 Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal

Learning Robust, Transferable Sentence Representations for Text Classification

Although deep recurrent neural networks (RNNs) demonstrate strong performance in text classification, training RNN models is often expensive and requires an extensive collection of annotated data, which may not be available. To overcome the…

Computation and Language · Computer Science 2018-10-02 Wasi Uddin Ahmad, Xueying Bai, Nanyun Peng, Kai-Wei Chang

Unsupervised Learning of Sentence Representations Using Sequence Consistency

Computing universal distributed representations of sentences is a fundamental task in natural language processing. We propose ConsSent, a simple yet surprisingly powerful unsupervised method to learn such representations by enforcing…

Computation and Language · Computer Science 2019-01-25 Siddhartha Brahma

GLOSS: Generative Latent Optimization of Sentence Representations

We propose a method to learn unsupervised sentence representations in a non-compositional manner based on Generative Latent Optimization. Our approach does not impose any assumptions on how words are to be combined into a sentence…

Computation and Language · Computer Science 2019-08-14 Sidak Pal Singh, Angela Fan, Michael Auli

Learning Generic Sentence Representations Using Convolutional Neural Networks

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Recent pre-training methods in NLP focus on learning either bottom or top-level…

Computation and Language · Computer Science 2020-11-02 Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

What do you learn from context? Probing for sentence structure in contextualized word representations

Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work,…

Computation and Language · Computer Science 2019-05-16 Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick

A Theoretical Analysis of Contrastive Unsupervised Representation Learning

Recent empirical works have successfully used unlabeled data to learn feature representations that are broadly useful in downstream classification tasks. Several of these methods are reminiscent of the well-known word2vec embedding…

Machine Learning · Computer Science 2019-02-26 Sanjeev Arora, Hrishikesh Khandeparkar, Mikhail Khodak, Orestis Plevrakis, Nikunj Saunshi

Dis-S2V: Discourse Informed Sen2Vec

Vector representation of sentences is important for many text processing tasks that involve clustering, classifying, or ranking sentences. Recently, distributed representation of sentences learned by neural models from unlabeled data has…

Computation and Language · Computer Science 2016-10-27 Tanay Kumar Saha, Shafiq Joty, Naeemul Hassan, Mohammad Al Hasan

Deconvolutional Paragraph Representation Learning

Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality…

Computation and Language · Computer Science 2017-09-25 Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

Continual Learning for Sentence Representations Using Conceptors

Distributed representations of sentences have become ubiquitous in natural language processing tasks. In this paper, we consider a continual learning scenario for sentence representations: Given a sequence of corpora, we aim to optimize the…

Machine Learning · Computer Science 2019-04-22 Tianlin Liu, Lyle Ungar, João Sedoc

A Classification Approach to Word Prediction

The eventual goal of a language model is to accurately predict the value of a missing word given its context. We present an approach to word prediction that is based on learning a representation for each word as a function of words and…

Computation and Language · Computer Science 2007-05-23 Yair Even-Zohar, Dan Roth

A Comprehensive Survey of Sentence Representations: From the BERT Epoch to the ChatGPT Era and Beyond

Sentence representations are a critical component in NLP applications such as retrieval, question answering, and text classification. They capture the meaning of a sentence, enabling machines to understand and reason over human language. In…

Computation and Language · Computer Science 2024-02-05 Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Viktor Schlegel, Stefan Winkler, See-Kiong Ng, Soujanya Poria

Unsupervised Sentence Representations as Word Information Series: Revisiting TF–IDF

Sentence representation at the semantic level is a challenging task for Natural Language Processing and Artificial Intelligence. Despite the advances in word embeddings (i.e. word vector representations), capturing sentence meaning is an…

Computation and Language · Computer Science 2017-10-23 Ignacio Arroyo-Fernández, Carlos-Francisco Méndez-Cruz, Gerardo Sierra, Juan-Manuel Torres-Moreno, Grigori Sidorov