Related papers: Describing Multimedia Content using Attention-base…

Understanding How Encoder-Decoder Architectures Attend

Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However,…

Machine Learning · Computer Science 2021-10-29 Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan

Understanding attention-based encoder-decoder networks: a case study with chess scoresheet recognition

Deep neural networks are largely used for complex prediction tasks. There is plenty of empirical evidence of their successful end-to-end training for a diversity of tasks. Success is often measured based solely on the final performance of…

Computer Vision and Pattern Recognition · Computer Science 2024-06-12 Sergio Y. Hayashi, Nina S. T. Hirata

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Current state-of-the-art machine translation systems are based on encoder-decoder architectures that first encode the input sequence and then generate an output sequence based on the input encoding. Both are interfaced with an attention…

Computation and Language · Computer Science 2018-11-02 Maha Elbayad, Laurent Besacier, Jakob Verbeek

Attention mechanisms in neural networks

Attention mechanisms represent a fundamental paradigm shift in neural network architectures, enabling models to selectively focus on relevant portions of input sequences through learned weighting functions. This monograph provides a…

Machine Learning · Computer Science 2026-01-08 Hasi Hays

An Attention-Based Deep Net for Learning to Rank

In information retrieval, learning to rank constructs a machine-based ranking model which given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to…

Machine Learning · Computer Science 2017-12-12 Baiyang Wang, Diego Klabjan

Survey on the attention based RNN model and its applications in computer vision

The recurrent neural networks (RNN) can be used to solve the sequence to sequence problem, where both the input and the output have sequential structures. Usually there are some implicit relations between the structures. However, it is hard…

Computer Vision and Pattern Recognition · Computer Science 2016-01-27 Feng Wang, David M. J. Tax

Modeling Latent Attention Within Neural Networks

Deep neural networks are able to solve tasks across a variety of domains and modalities of data. Despite many empirical successes, we lack the ability to clearly understand and interpret the learned internal mechanisms that contribute to…

Artificial Intelligence · Computer Science 2018-01-03 Christopher Grimm, Dilip Arumugam, Siddharth Karamcheti, David Abel, Lawson L. S. Wong, Michael L. Littman

Convolution, attention and structure embedding

Deep neural networks are composed of layers of parametrised linear operations intertwined with non-linear activations. In basic models, such as the multi-layer perceptron, a linear layer operates on a simple input vector embedding of the…

Machine Learning · Computer Science 2020-03-06 Jean-Marc Andreoli

Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

This paper presents a novel keypoints-based attention mechanism for visual recognition in still images. Deep Convolutional Neural Networks (CNNs) for recognizing images with distinctive classes have shown great success, but their…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Asish Bera, Zachary Wharton, Yonghuai Liu, Nik Bessis, Ardhendu Behera

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Although deep neural networks generally have fixed network structures, the concept of dynamic mechanism has drawn more and more attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for…

Machine Learning · Computer Science 2019-12-03 Lanqing Xue, Xiaopeng Li, Nevin L. Zhang

Structured Attention Networks

Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network. However, for many tasks we may want to model richer structural dependencies without abandoning end-to-end training.…

Computation and Language · Computer Science 2017-02-17 Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks

With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNN) are considered particularly well suited for modeling sensory and streaming data.…

Machine Learning · Computer Science 2017-11-15 Stephan Baier, Sigurd Spieckermann, Volker Tresp

An Empirical Study of Spatial Attention Mechanisms in Deep Networks

Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-15 Xizhou Zhu, Dazhi Cheng, Zheng Zhang, Stephen Lin, Jifeng Dai

Variational Structured Attention Networks for Deep Visual Representation Learning

Convolutional neural networks have enabled major progress in addressing pixel-level prediction tasks such as semantic segmentation, depth estimation, surface normal prediction and so on, benefiting from their powerful capabilities in…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Guanglei Yang, Paolo Rota, Xavier Alameda-Pineda, Dan Xu, Mingli Ding, Elisa Ricci

Machine Learning for Brain Disorders: Transformers and Visual Transformers

Transformers were initially introduced for natural language processing (NLP) tasks, but they were quickly adopted by most deep learning fields, including computer vision. They measure the relationships between pairs of input tokens (words in…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Robin Courant, Maika Edberg, Nicolas Dufour, Vicky Kalogeiton

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers

Transformers are increasingly dominating multi-modal reasoning tasks, such as visual question answering, achieving state-of-the-art results thanks to their ability to contextualize information using the self-attention and co-attention…

Computer Vision and Pattern Recognition · Computer Science 2021-03-30 Hila Chefer, Shir Gur, Lior Wolf

Neural Attention for Image Captioning: Review of Outstanding Methods

Image captioning is the task of automatically generating sentences that describe an input image in the best way possible. The most successful techniques for automatically generating image captions have recently used attentive deep learning…

Computer Vision and Pattern Recognition · Computer Science 2021-12-01 Zanyar Zohourianshahzadi, Jugal K. Kalita

Attention with Intention for a Neural Network Conversation Model

In a conversation or a dialogue process, attention and intention play intrinsic roles. This paper proposes a neural network based approach that models the attention and intention processes. It essentially consists of three recurrent…

Neural and Evolutionary Computing · Computer Science 2015-11-06 Kaisheng Yao, Geoffrey Zweig, Baolin Peng

On the Interpretability of Attention Networks

Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like…

Machine Learning · Computer Science 2023-05-16 Lakshmi Narayan Pandey, Rahul Vashisht, Harish G. Ramaswamy

Gated recurrent neural networks discover attention

Recent architectural developments have enabled recurrent neural networks (RNNs) to reach and even surpass the performance of Transformers on certain sequence modeling tasks. These modern RNNs feature a prominent design pattern: linear…

Machine Learning · Computer Science 2024-02-08 Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, Johannes von Oswald, Maxime Larcher, Angelika Steger, João Sacramento