Related papers: Is Attention Interpretable?

Attention Interpretability Across NLP Tasks

The attention layer in a neural network model provides insights into the model's reasoning behind its prediction, which are usually criticized for being opaque. Recently, seemingly contradictory viewpoints have emerged about the…

Computation and Language · Computer Science 2019-09-26 Shikhar Vashishth, Shyam Upadhyay, Gaurav Singh Tomar, Manaal Faruqui

Attention is not Explanation

Attention mechanisms have seen wide adoption in neural NLP models. In addition to improving predictive performance, these are often touted as affording transparency: models equipped with attention provide a distribution over attended-to…

Computation and Language · Computer Science 2019-05-10 Sarthak Jain, Byron C. Wallace

Revisiting Attention Weights as Explanations from an Information Theoretic Perspective

Attention mechanisms have recently demonstrated impressive performance on a range of NLP tasks, and attention scores are often used as a proxy for model explainability. However, there is a debate on whether attention weights can, in fact,…

Computation and Language · Computer Science 2022-11-16 Bingyang Wen, K. P. Subbalakshmi, Fan Yang

Rethinking Self-Attention: Towards Interpretability in Neural Parsing

Attention mechanisms have improved the performance of NLP tasks while allowing models to remain explainable. Self-attention is currently widely used, however interpretability is difficult due to the numerous attention distributions. Recent…

Computation and Language · Computer Science 2020-10-30 Khalil Mrini, Franck Dernoncourt, Quan Tran, Trung Bui, Walter Chang, Ndapa Nakashole

Learning to Deceive with Attention-Based Explanations

Attention mechanisms are ubiquitous components in neural architectures applied to natural language processing. In addition to yielding gains in predictive accuracy, attention weights are often claimed to confer interpretability, purportedly…

Computation and Language · Computer Science 2020-04-08 Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

Why Attentions May Not Be Interpretable?

Attention-based methods have played important roles in model interpretations, where the calculated attention weights are expected to highlight the critical parts of inputs (e.g., keywords in sentences). However, recent research found that…

Machine Learning · Statistics 2021-06-04 Bing Bai, Jian Liang, Guanhua Zhang, Hao Li, Kun Bai, Fei Wang

Is Attention Interpretation? A Quantitative Assessment On Sets

The debate around the interpretability of attention mechanisms is centered on whether attention scores can be used as a proxy for the relative amounts of signal carried by sub-components of data. We propose to study the interpretability of…

Machine Learning · Computer Science 2022-07-27 Jonathan Haab, Nicolas Deutschmann, Maria Rodríguez Martínez

Staying True to Your Word: (How) Can Attention Become Explanation?

The attention mechanism has quickly become ubiquitous in NLP. In addition to improving performance of models, attention has been widely used as a glimpse into the inner workings of NLP models. The latter aspect has in the recent years…

Computation and Language · Computer Science 2020-05-20 Martin Tutek, Jan Šnajder

Towards Transparent and Explainable Attention Models

Recent studies on interpretability of attention distributions have led to notions of faithful and plausible explanations for a model's predictions. Attention distributions can be considered a faithful explanation if a higher attention…

Computation and Language · Computer Science 2020-04-30 Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

An Introductory Survey on Attention Mechanisms in NLP Problems

First derived from human intuition, later adapted to machine translation for automatic token alignment, attention mechanism, a simple method that can be used for encoding sequence data based on the importance score each element is assigned,…

Computation and Language · Computer Science 2018-11-15 Dichao Hu

Attention is not not Explanation

Attention mechanisms play a central role in NLP systems, especially within recurrent neural network (RNN) models. Recently, there has been increasing interest in whether or not the intermediate representations offered by these modules may…

Computation and Language · Computer Science 2019-09-06 Sarah Wiegreffe, Yuval Pinter

Are Sixteen Heads Really Better than One?

Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions. In particular, multi-headed attention is a driving…

Computation and Language · Computer Science 2019-11-05 Paul Michel, Omer Levy, Graham Neubig

More Identifiable yet Equally Performant Transformers for Text Classification

Interpretability is an important aspect of the trustworthiness of a model's predictions. Transformer's predictions are widely explained by the attention weights, i.e., a probability distribution generated at its self-attention unit (head).…

Computation and Language · Computer Science 2021-06-03 Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria, Eduard Hovy

A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing

There has been significant debate in the NLP community about whether or not attention weights can be used as an explanation - a mechanism for interpreting how important each input token is for a particular prediction. The validity of…

Computation and Language · Computer Science 2022-05-11 Michael Neely, Stefan F. Schouten, Maurits Bleeker, Ana Lucic

Rethinking Attention-Model Explainability through Faithfulness Violation Test

Attention mechanisms are dominating the explainability of deep models. They produce probability distributions over the input, which are widely deemed as feature-importance indicators. However, in this paper, we find one critical limitation…

Machine Learning · Computer Science 2022-07-06 Yibing Liu, Haoliang Li, Yangyang Guo, Chenqi Kong, Jing Li, Shiqi Wang

Is Sparse Attention more Interpretable?

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet the attention distribution is typically over representations internal to the model rather than the inputs…

Computation and Language · Computer Science 2021-06-09 Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell

Adding Interpretable Attention to Neural Translation Models Improves Word Alignment

Multi-layer models with multiple attention heads per layer provide superior translation quality compared to simpler and shallower models, but determining what source context is most relevant to each target word is more challenging as a…

Computation and Language · Computer Science 2019-02-01 Thomas Zenkel, Joern Wuebker, John DeNero

Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability

Recent years have witnessed an increasing number of interpretation methods being developed for improving transparency of NLP models. Meanwhile, researchers also try to answer the question that whether the obtained interpretation is faithful…

Computation and Language · Computer Science 2020-09-17 Ninghao Liu, Yunsong Meng, Xia Hu, Tie Wang, Bo Long

Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification

Neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations. Attention has empirically been demonstrated to improve performance in various…

Computation and Language · Computer Science 2021-05-10 George Chrysostomou, Nikolaos Aletras

Interrogating the Explanatory Power of Attention in Neural Machine Translation

Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model's decision in generating a specific token but it has not yet been rigorously established…

Computation and Language · Computer Science 2019-10-02 Pooya Moradi, Nishant Kambhatla, Anoop Sarkar