Related papers: Iterative Recursive Attention Model for Interpreta…

Neural Machine Translation with Recurrent Attention Modeling

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et…

Neural and Evolutionary Computing · Computer Science 2016-07-19 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Assessing incrementality in sequence-to-sequence models

Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention…

Computation and Language · Computer Science 2019-06-11 Dennis Ulmer, Dieuwke Hupkes, Elia Bruni

Is Attention Interpretable?

Attention mechanisms have recently boosted performance on a range of NLP tasks. Because attention layers explicitly weight input components' representations, it is also often assumed that attention can be used to identify information that…

Computation and Language · Computer Science 2019-06-11 Sofia Serrano, Noah A. Smith

An Introductory Survey on Attention Mechanisms in NLP Problems

First derived from human intuition, later adapted to machine translation for automatic token alignment, attention mechanism, a simple method that can be used for encoding sequence data based on the importance score each element is assigned,…

Computation and Language · Computer Science 2018-11-15 Dichao Hu

Unsupervised Learning of Explainable Parse Trees for Improved Generalisation

Recursive neural networks (RvNN) have been shown useful for learning sentence representations and helped achieve competitive performance on several natural language inference tasks. However, recent RvNN-based models fail to learn simple…

Computation and Language · Computer Science 2021-04-13 Atul Sahay, Ayush Maheshwari, Ritesh Kumar, Ganesh Ramakrishnan, Manjesh Kumar Hanawal, Kavi Arya

Structural Attention Neural Networks for improved sentiment analysis

We introduce a tree-structured attention neural network for sentences and small phrases and apply it to the problem of sentiment classification. Our model expands the current recursive models by incorporating structural information around a…

Computation and Language · Computer Science 2017-01-10 Filippos Kokkinos, Alexandros Potamianos

Evolving Attention with Residual Convolutions

The Transformer is a ubiquitous model for natural language processing and has attracted wide attention in computer vision. The attention maps are indispensable for a transformer model to encode the dependencies among input tokens. However,…

Machine Learning · Computer Science 2021-02-26 Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

Iterative Alternating Neural Attention for Machine Reading

We propose a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document. Unlike previous models, we do not collapse the query into a single vector, instead we…

Computation and Language · Computer Science 2016-11-10 Alessandro Sordoni, Philip Bachman, Adam Trischler, Yoshua Bengio

Building Interpretable Models for Business Process Prediction using Shared and Specialised Attention Mechanisms

In this paper, we address the "black-box" problem in predictive process analytics by building interpretable models that are capable of explaining both what a prediction is and why it was made. Predictive process analytics is a newly emerged discipline…

Machine Learning · Computer Science 2022-04-26 Bemali Wickramanayake, Zhipeng He, Chun Ouyang, Catarina Moreira, Yue Xu, Renuka Sindhgatta

Explainability of Text Processing and Retrieval Methods: A Survey

Deep Learning and Machine Learning based models have become extremely popular in text processing and information retrieval. However, the non-linear structures present inside the networks make these models largely inscrutable. A significant…

Information Retrieval · Computer Science 2026-03-12 Sourav Saha, Debapriyo Majumdar, Mandar Mitra

Attention in Natural Language Processing

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview…

Computation and Language · Computer Science 2021-10-12 Andrea Galassi, Marco Lippi, Paolo Torroni

Recurrence-Aware Long-Term Cognitive Network for Explainable Pattern Classification

Machine learning solutions for pattern classification problems are nowadays widely deployed in society and industry. However, the lack of transparency and accountability of most accurate models often hinders their safe use. Thus, there is a…

Machine Learning · Computer Science 2021-12-24 Gonzalo Nápoles, Yamisleydi Salgueiro, Isel Grau, Maikel Leon Espinosa

An Incremental Iterated Response Model of Pragmatics

Recent Iterated Response (IR) models of pragmatics conceptualize language use as a recursive process in which agents reason about each other to increase communicative efficiency. These models are generally defined over complete utterances.…

Computation and Language · Computer Science 2018-10-23 Reuben Cohn-Gordon, Noah D. Goodman, Christopher Potts

Rigorous Interpretation Is a Form of Evaluation

Current machine learning models are evaluated through behavioral snapshots, with benchmark accuracies, win rates and outcome-based metrics. Model explanations and evaluations, however, are fundamentally intertwined: understanding why a…

Computers and Society · Computer Science 2026-05-08 Isabelle Lee, Emmy Liu, Cathy Jiao, Brihi Joshi, Dani Yogatama, Fazl Barez, Michael Saxon

Iteratively Prompt Pre-trained Language Models for Chain of Thought

While Pre-trained Language Models (PLMs) internalize a great amount of world knowledge, they have been shown to be incapable of recalling this knowledge to solve tasks requiring complex, multi-step reasoning. Similar to how humans develop a…

Computation and Language · Computer Science 2022-10-25 Boshi Wang, Xiang Deng, Huan Sun

Is Attention Interpretation? A Quantitative Assessment On Sets

The debate around the interpretability of attention mechanisms is centered on whether attention scores can be used as a proxy for the relative amounts of signal carried by sub-components of data. We propose to study the interpretability of…

Machine Learning · Computer Science 2022-07-27 Jonathan Haab, Nicolas Deutschmann, Maria Rodríguez Martínez

Recursive Parameter Estimation: Convergence

We consider estimation procedures which are recursive in the sense that each successive estimator is obtained from the previous one by a simple adjustment. We propose a wide class of recursive estimation procedures for the general…

Statistics Theory · Mathematics 2007-05-23 Teo Sharia

Pre-training Attention Mechanisms

Recurrent neural networks with differentiable attention mechanisms have had success in generative and classification tasks. We show that the classification performance of such models can be enhanced by guiding a randomly initialized model…

Machine Learning · Computer Science 2017-12-18 Jack Lindsey

An Iterative Associative Memory Model for Empathetic Response Generation

Empathetic response generation aims to comprehend the cognitive and emotional states in dialogue utterances and generate proper responses. Psychological theories posit that comprehending emotional and cognitive states necessitates…

Computation and Language · Computer Science 2024-06-04 Zhou Yang, Zhaochun Ren, Yufeng Wang, Chao Chen, Haizhou Sun, Xiaofei Zhu, Xiangwen Liao

An Iterative Contextualization Algorithm with Second-Order Attention

Combining the representations of the words that make up a sentence into a cohesive whole is difficult, since it needs to account for the order of words, and to establish how the words present relate to each other. The solution we propose…

Computation and Language · Computer Science 2021-03-04 Diego Maupomé, Marie-Jean Meurs