Related papers: Assessing incrementality in sequence-to-sequence m…

Modelling Sentence Pairs with Tree-structured Attentive Encoder

We describe an attentive encoder that combines tree-structured recursive neural networks and sequential recurrent neural networks for modelling sentence pairs. Since existing attentive models exert attention on the sequential structure, we…

Computation and Language · Computer Science 2016-10-11 Yao Zhou, Cong Liu, Yan Pan

Iterative Recursive Attention Model for Interpretable Sequence Classification

Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an…

Computation and Language · Computer Science 2018-09-03 Martin Tutek, Jan Šnajder

Human Sentence Processing: Recurrence or Attention?

Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks…

Computation and Language · Computer Science 2022-03-31 Danny Merkx, Stefan L. Frank

Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition

Encoder-decoder models have become an effective approach for sequence learning tasks like machine translation, image captioning and speech recognition, but have yet to show competitive results for handwritten text recognition. To this end,…

Computer Vision and Pattern Recognition · Computer Science 2019-07-16 Johannes Michael, Roger Labahn, Tobias Grüning, Jochen Zöllner

Attention with Intention for a Neural Network Conversation Model

In a conversation or a dialogue process, attention and intention play intrinsic roles. This paper proposes a neural network based approach that models the attention and intention processes. It essentially consists of three recurrent…

Neural and Evolutionary Computing · Computer Science 2015-11-06 Kaisheng Yao, Geoffrey Zweig, Baolin Peng

Focused Hierarchical RNNs for Conditional Sequence Processing

Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and…

Machine Learning · Statistics 2018-06-13 Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal

An Introductory Survey on Attention Mechanisms in NLP Problems

First derived from human intuition, later adapted to machine translation for automatic token alignment, attention mechanism, a simple method that can be used for encoding sequence data based on the importance score each element is assigned,…

Computation and Language · Computer Science 2018-11-15 Dichao Hu

Analysing the potential of seq-to-seq models for incremental interpretation in task-oriented dialogue

We investigate how encoder-decoder models trained on a synthetic dataset of task-oriented dialogues process disfluencies, such as hesitations and self-corrections. We find that, contrary to earlier results, disfluencies have very little…

Computation and Language · Computer Science 2018-08-29 Dieuwke Hupkes, Sanne Bouwmeester, Raquel Fernández

Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings

Since Bahdanau et al. [1] first introduced attention for neural machine translation, most sequence-to-sequence models made use of attention mechanisms [2, 3, 4]. While they produce soft-alignment matrices that could be interpreted as…

Computation and Language · Computer Science 2019-09-12 Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier

Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing

Recently, encoder-decoder neural networks have shown impressive performance on many sequence-related tasks. The architecture commonly uses an attentional mechanism which allows the model to learn alignments between the source and the target…

Computation and Language · Computer Science 2017-11-06 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model

A sequence-to-sequence model is a neural network module for mapping two sequences of different lengths. The sequence-to-sequence model has three core modules: encoder, decoder, and attention. Attention is the bridge that connects the…

Computation and Language · Computer Science 2018-07-24 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence that is to be encoded is available in full, to be processed either…

Computation and Language · Computer Science 2024-03-29 Brielen Madureira, David Schlangen

Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding

Context plays an important role in human language understanding, thus it may also be useful for machines learning vector representations of language. In this paper, we explore an asymmetric encoder-decoder structure for unsupervised…

Neural and Evolutionary Computing · Computer Science 2018-06-04 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Exploring Neural Transducers for End-to-End Speech Recognition

In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and attention-based Seq2Seq models for end-to-end speech recognition. We show that, without any language model, Seq2Seq and RNN-Transducer models both…

Computation and Language · Computer Science 2017-07-25 Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu

Discovering Useful Sentence Representations from Large Pretrained Language Models

Despite the extensive success of pretrained language models as encoders for building NLP systems, they haven't seen prominence as decoders for sequence generation tasks. We explore the question of whether these models can be adapted to be…

Computation and Language · Computer Science 2020-08-21 Nishant Subramani, Nivedita Suresh

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Current state-of-the-art machine translation systems are based on encoder-decoder architectures, that first encode the input sequence, and then generate an output sequence based on the input encoding. Both are interfaced with an attention…

Computation and Language · Computer Science 2018-11-02 Maha Elbayad, Laurent Besacier, Jakob Verbeek

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length…

Computation and Language · Computer Science 2014-10-08 Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, Yoshua Bengio

Reasoning about Entailment with Neural Attention

While most approaches to automatically recognizing entailment relations have used classifiers employing hand engineered features derived from complex natural language processing pipelines, in practice their performance has been only…

Computation and Language · Computer Science 2016-03-02 Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom

Coherent Dialogue with Attention-based Language Models

We model coherent conversation continuation via RNN-based dialogue models equipped with a dynamic attention mechanism. Our attention-RNN language model dynamically increases the scope of attention on the history as the conversation…

Computation and Language · Computer Science 2016-11-22 Hongyuan Mei, Mohit Bansal, Matthew R. Walter

Temporal Attention Model for Neural Machine Translation

Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve the NMT attention.…

Computation and Language · Computer Science 2016-08-10 Baskaran Sankaran, Haitao Mi, Yaser Al-Onaizan, Abe Ittycheriah