Related papers: Efficient Purely Convolutional Text Encoding

Contextualized Spoken Word Representations from Convolutional Autoencoders

A lot of work has been done to build text-based language models for performing different NLP tasks, but not much research has been done in the case of audio-based language models. This paper proposes a Convolutional Autoencoder based neural…

Computation and Language · Computer Science 2020-09-30 Prakamya Mishra, Pranav Mathur

Deconvolutional Paragraph Representation Learning

Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality…

Computation and Language · Computer Science 2017-09-25 Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

Byte-Level Recursive Convolutional Auto-Encoder for Text

This article proposes to auto-encode text at byte-level using convolutional networks with a recursive architecture. The motivation is to explore whether it is possible to have scalable and homogeneous text generation at byte-level in a…

Computation and Language · Computer Science 2018-02-07 Xiang Zhang, Yann LeCun

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks

There is a lot of research interest in encoding variable length sentences into fixed length vectors, in a way that preserves the sentence meanings. Two common methods include representations based on averaging word vectors, and…

Computation and Language · Computer Science 2017-02-10 Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg

Semantic Sentence Embeddings for Paraphrasing and Text Summarization

This paper introduces a sentence to vector encoding framework suitable for advanced natural language processing. Our latent representation is shown to encode sentences with common semantic information with similar vector representations.…

Computation and Language · Computer Science 2018-09-30 Chi Zhang, Shagan Sah, Thang Nguyen, Dheeraj Peri, Alexander Loui, Carl Salvaggio, Raymond Ptucha

Consistent Alignment of Word Embedding Models

Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as…

Computation and Language · Computer Science 2017-02-27 Cem Safak Sahin, Rajmonda S. Caceres, Brandon Oselio, William M. Campbell

Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding

Context plays an important role in human language understanding, thus it may also be useful for machines learning vector representations of language. In this paper, we explore an asymmetric encoder-decoder structure for unsupervised…

Neural and Evolutionary Computing · Computer Science 2018-06-04 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Text Simplification with Sentence Embeddings

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels.…

Computation and Language · Computer Science 2025-10-29 Matthew Shardlow

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks. However, word embeddings are represented by dense, real-valued vectors and they are therefore…

Computation and Language · Computer Science 2019-12-24 Andreas Hanselowski, Iryna Gurevych

Learning Compressed Sentence Representations for On-Device Text Processing

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued,…

Computation and Language · Computer Science 2019-06-21 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Sentence transition matrix: An efficient approach that preserves sentence semantics

Sentence embedding is a significant research topic in the field of natural language processing (NLP). Generating sentence embedding vectors reflecting the intrinsic meaning of a sentence is a key factor to achieve an enhanced performance in…

Computation and Language · Computer Science 2019-01-17 Myeongjun Jang, Pilsung Kang

Context-Dependent Word Representation for Neural Machine Translation

We first observe a potential weakness of continuous vector representations of symbols in neural machine translation. That is, the continuous vector representation, or a word embedding vector, of a symbol encodes multiple dimensions of…

Computation and Language · Computer Science 2016-07-05 Heeyoul Choi, Kyunghyun Cho, Yoshua Bengio

Bag-of-Vector Embeddings of Dependency Graphs for Semantic Induction

Vector-space models, from word embeddings to neural network parsers, have many advantages for NLP. But how to generalise from fixed-length word vectors to a vector space for arbitrary linguistic structures is still unclear. In this paper we…

Computation and Language · Computer Science 2017-10-03 Diana Nicoleta Popa, James Henderson

Semantic Vector Machines

We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into a $d$-dimensional space, such that n-grams that are the translation of each…

Machine Learning · Computer Science 2011-05-17 Etter Vincent

Tensorized Embedding Layers for Efficient Model Compression

The embedding layers transforming input words into real vectors are the key components of deep neural networks used in natural language processing. However, when the vocabulary is large, the corresponding weight matrices can be enormous,…

Computation and Language · Computer Science 2020-02-20 Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena Orlova, Ivan Oseledets

Multiple Word Embeddings for Increased Diversity of Representation

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand.…

Computation and Language · Computer Science 2020-10-13 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury, Srinivas Bangalore

A Hybrid Convolutional Variational Autoencoder for Text Generation

In this paper we explore the effect of architectural choices on learning a Variational Autoencoder (VAE) for text generation. In contrast to the previously introduced VAE model for text where both the encoder and decoder are RNNs, we…

Computation and Language · Computer Science 2017-02-09 Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth

Learning Sentence Embeddings for Coherence Modelling and Beyond

We present a novel and effective technique for performing text coherence tasks while facilitating deeper insights into the data. Despite obtaining ever-increasing task performance, modern deep-learning approaches to NLP tasks often only…

Computation and Language · Computer Science 2019-08-09 Tanner Bohn, Yining Hu, Jinhang Zhang, Charles X. Ling

Near-lossless Binarization of Word Embeddings

Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performances. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of…

Computation and Language · Computer Science 2020-01-23 Julien Tissier, Christophe Gravier, Amaury Habrard

Nugget: Neural Agglomerative Embeddings of Text

Embedding text sequences is a widespread requirement in modern language understanding. Existing approaches focus largely on constant-size representations. This is problematic, as the amount of information contained in text often varies with…

Computation and Language · Computer Science 2023-10-04 Guanghui Qin, Benjamin Van Durme