Related papers: Focused Hierarchical RNNs for Conditional Sequence…

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Although deep neural networks generally have fixed network structures, the concept of dynamic mechanism has drawn more and more attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for…

Machine Learning · Computer Science 2019-12-03 Lanqing Xue , Xiaopeng Li , Nevin L. Zhang

Attention-based Neural Load Forecasting: A Dynamic Feature Selection Approach

Encoder-decoder-based recurrent neural network (RNN) has made significant progress in sequence-to-sequence learning tasks such as machine translation and conversational models. Recent works have shown the advantage of this type of network…

Machine Learning · Computer Science 2023-05-10 Jing Xiong , Pengyang Zhou , Alan Chen , Yu Zhang

Understanding How Encoder-Decoder Architectures Attend

Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However,…

Machine Learning · Computer Science 2021-10-29 Kyle Aitken , Vinay V Ramasesh , Yuan Cao , Niru Maheswaranathan

Recurrent Neural Network Encoder with Attention for Community Question Answering

We apply a general recurrent neural network (RNN) encoder framework to community question answering (cQA) tasks. Our approach does not rely on any linguistic processing, and can be applied to different languages or domains. Further…

Computation and Language · Computer Science 2016-03-24 Wei-Ning Hsu , Yu Zhang , James Glass

Survey on the attention based RNN model and its applications in computer vision

The recurrent neural networks (RNN) can be used to solve the sequence to sequence problem, where both the input and the output have sequential structures. Usually there are some implicit relations between the structures. However, it is hard…

Computer Vision and Pattern Recognition · Computer Science 2016-01-27 Feng Wang , David M. J. Tax

Recurrently Controlled Recurrent Networks

Recurrent neural networks (RNNs) such as long short-term memory and gated recurrent units are pivotal building blocks across a broad spectrum of sequence modeling problems. This paper proposes a recurrently controlled recurrent network…

Computation and Language · Computer Science 2018-11-27 Yi Tay , Luu Anh Tuan , Siu Cheung Hui

Modelling Sentence Pairs with Tree-structured Attentive Encoder

We describe an attentive encoder that combines tree-structured recursive neural networks and sequential recurrent neural networks for modelling sentence pairs. Since existing attentive models exert attention on the sequential structure, we…

Computation and Language · Computer Science 2016-10-11 Yao Zhou , Cong Liu , Yan Pan

Gated recurrent neural networks discover attention

Recent architectural developments have enabled recurrent neural networks (RNNs) to reach and even surpass the performance of Transformers on certain sequence modeling tasks. These modern RNNs feature a prominent design pattern: linear…

Machine Learning · Computer Science 2024-02-08 Nicolas Zucchet , Seijin Kobayashi , Yassir Akram , Johannes von Oswald , Maxime Larcher , Angelika Steger , João Sacramento

Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units

Recurrent neural network (RNN) has been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism (in control of how information flows between hidden states). However, the…

Computer Vision and Pattern Recognition · Computer Science 2020-05-27 Zhanzhan Cheng , Yunlu Xu , Mingjian Cheng , Yu Qiao , Shiliang Pu , Yi Niu , Fei Wu

Describing Multimedia Content using Attention-based Encoder--Decoder Networks

Whereas deep neural networks were first mostly used for classification tasks, they are rapidly expanding in the realm of structured output problems, where the observed target is composed of multiple random variables that have a rich joint…

Neural and Evolutionary Computing · Computer Science 2016-11-15 Kyunghyun Cho , Aaron Courville , Yoshua Bengio

Scene Parsing via Dense Recurrent Neural Networks with Attentional Selection

Recurrent neural networks (RNNs) have shown the ability to improve scene parsing through capturing long-range dependencies among image units. In this paper, we propose dense RNNs for scene labeling by exploring various long-range semantic…

Computer Vision and Pattern Recognition · Computer Science 2018-11-13 Heng Fan , Peng Chu , Longin Jan Latecki , Haibin Ling

Contextually Structured Token Dependency Encoding for Large Language Models

Token representation strategies within large-scale neural architectures often rely on contextually refined embeddings, yet conventional approaches seldom encode structured relationships explicitly within token interactions. Self-attention…

Computation and Language · Computer Science 2025-03-27 James Blades , Frederick Somerfield , William Langley , Susan Everingham , Maurice Witherington

Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition

Encoder-decoder models have become an effective approach for sequence learning tasks like machine translation, image captioning and speech recognition, but have yet to show competitive results for handwritten text recognition. To this end,…

Computer Vision and Pattern Recognition · Computer Science 2019-07-16 Johannes Michael , Roger Labahn , Tobias Grüning , Jochen Zöllner

Gated Recurrent Neural Tensor Network

Recurrent Neural Networks (RNNs), which are a powerful scheme for modeling temporal and sequential data need to capture long-term dependencies on datasets and represent them in hidden layers with a powerful model to capture more information…

Machine Learning · Computer Science 2017-06-08 Andros Tjandra , Sakriani Sakti , Ruli Manurung , Mirna Adriani , Satoshi Nakamura

Review Networks for Caption Generation

We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoder- decoder model: in this paper, we consider RNN decoders with both CNN and RNN…

Machine Learning · Computer Science 2016-10-28 Zhilin Yang , Ye Yuan , Yuexin Wu , Ruslan Salakhutdinov , William W. Cohen

Assessing incrementality in sequence-to-sequence models

Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention…

Computation and Language · Computer Science 2019-06-11 Dennis Ulmer , Dieuwke Hupkes , Elia Bruni

Randomized Deep Structured Prediction for Discourse-Level Processing

Expressive text encoders such as RNNs and Transformer Networks have been at the center of NLP models in recent work. Most of the effort has focused on sentence-level tasks, capturing the dependencies between words in a single sentence, or…

Computation and Language · Computer Science 2021-09-15 Manuel Widmoser , Maria Leonor Pacheco , Jean Honorio , Dan Goldwasser

Understanding Hidden Memories of Recurrent Neural Networks

Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their…

Computation and Language · Computer Science 2017-10-31 Yao Ming , Shaozu Cao , Ruixiang Zhang , Zhen Li , Yuanzhe Chen , Yangqiu Song , Huamin Qu

Conditional Channel Gated Networks for Task-Aware Continual Learning

Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems: as they meet the objective of the current training examples, their performance on previous tasks drops drastically. In this…

Computer Vision and Pattern Recognition · Computer Science 2020-04-02 Davide Abati , Jakub Tomczak , Tijmen Blankevoort , Simone Calderara , Rita Cucchiara , Babak Ehteshami Bejnordi

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Current state-of-the-art machine translation systems are based on encoder-decoder architectures, that first encode the input sequence, and then generate an output sequence based on the input encoding. Both are interfaced with an attention…

Computation and Language · Computer Science 2018-11-02 Maha Elbayad , Laurent Besacier , Jakob Verbeek