Related papers: Transformer Interpretability Beyond Attention Visu…

More Identifiable yet Equally Performant Transformers for Text Classification

Interpretability is an important aspect of the trustworthiness of a model's predictions. Transformer's predictions are widely explained by the attention weights, i.e., a probability distribution generated at its self-attention unit (head).…

Computation and Language · Computer Science · 2021-06-03 · Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria, Eduard Hovy

A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

We present a novel usage of Transformers to make image classification interpretable. Unlike mainstream classifiers that wait until the last fully connected layer to incorporate class information to make predictions, we investigate a…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-17 · Dipanjyoti Paul, Arpita Chowdhury, Xinqi Xiong, Feng-Ju Chang, David Carlyn, Samuel Stevens, Kaiya L. Provost, Anuj Karpatne, Bryan Carstens, Daniel Rubenstein, Charles Stewart, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao

Discriminating Spatial and Temporal Relevance in Deep Taylor Decompositions for Explainable Activity Recognition

Current techniques for explainable AI have been applied with some success to image processing. The recent rise of research in video processing has called for similar work in deconstructing and explaining spatio-temporal models. While many…

Machine Learning · Computer Science · 2019-08-15 · Liam Hiley, Alun Preece, Yulia Hicks, David Marshall, Harrison Taylor

Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization

Interpreting complex neural networks is crucial for understanding their decision-making processes, particularly in applications where transparency and accountability are essential. This proposed method addresses this need by focusing on…

Neural and Evolutionary Computing · Computer Science · 2024-12-10 · Deepshikha Bhati, Fnu Neha, Md Amiruzzaman, Angela Guercio, Deepak Kumar Shukla, Ben Ward

Mechanistic Interpretability for Transformer-based Time Series Classification

Transformer-based models have become state-of-the-art tools in various machine learning tasks, including time series classification, yet their complexity makes understanding their internal decision-making challenging. Existing…

Machine Learning · Computer Science · 2025-11-27 · Matīss Kalnāre, Sofoklis Kitharidis, Thomas Bäck, Niki van Stein

Evolving Attention with Residual Convolutions

Transformer is a ubiquitous model for natural language processing and has attracted wide attention in computer vision. The attention maps are indispensable for a transformer model to encode the dependencies among input tokens. However,…

Machine Learning · Computer Science · 2021-02-26 · Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction

While convolutional neural networks have shown a tremendous impact on various computer vision tasks, they generally demonstrate limitations in explicitly modeling long-range dependencies due to the intrinsic locality of the convolution…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-06 · Guanglei Yang, Hao Tang, Mingli Ding, Nicu Sebe, Elisa Ricci

Decision-Aware Attention Propagation for Vision Transformer Explainability

Vision Transformers (ViTs) have become a dominant architecture in computer vision, yet their prediction process remains difficult to interpret because information is propagated through complex interactions across layers and attention heads.…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-21 · Sehyeong Jo, Gangjae Jang, Haesol Park

Unlocking Layer-wise Relevance Propagation for Autoencoders

Autoencoders are a powerful and versatile tool often used for various problems such as anomaly detection, image processing and machine translation. However, their reconstructions are not always trivial to explain. Therefore, we propose a…

Machine Learning · Computer Science · 2023-03-22 · Kenyu Kobayashi, Renata Khasanova, Arno Schneuwly, Felix Schmidt, Matteo Casserini

Less is More: Pay Less Attention in Vision Transformers

Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision. However, Transformer training and inference in previous works…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-24 · Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai

Better Explain Transformers by Illuminating Important Information

Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by focusing on the raw gradient and attention as token…

Computation and Language · Computer Science · 2024-01-29 · Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li

Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability

The development of effective explainability tools for Transformers is a crucial pursuit in deep learning research. One of the most promising approaches in this domain is Layer-wise Relevance Propagation (LRP), which propagates relevance…

Machine Learning · Computer Science · 2025-06-04 · Yarden Bakish, Itamar Zimerman, Hila Chefer, Lior Wolf

Learn to Rank: Visual Attribution by Learning Importance Ranking

Interpreting the decisions of complex computer vision models is crucial to establish trust and accountability, especially in safety-critical domains. An established approach to interpretability is generating visual attribution maps that…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-08 · David Schinagl, Christian Fruhwirth-Reisinger, Alexander Prutsch, Samuel Schulter, Horst Possegger

An Attention Matrix for Every Decision: Faithfulness-based Arbitration Among Multiple Attention-Based Interpretations of Transformers in Text Classification

Transformers are widely used in natural language processing, where they consistently achieve state-of-the-art performance. This is mainly due to their attention-based architecture, which allows them to model rich linguistic relations…

Computation and Language · Computer Science · 2022-11-29 · Nikolaos Mylonas, Ioannis Mollas, Grigorios Tsoumakas

Transformer-F: A Transformer network with effective methods for learning universal sentence representation

The Transformer model is widely used in natural language processing for sentence representation. However, the previous Transformer-based models focus on function words that have limited meaning in most cases and could merely extract…

Computation and Language · Computer Science · 2021-07-05 · Yu Shi

Towards Evaluating Explanations of Vision Transformers for Medical Imaging

As deep learning models increasingly find applications in critical domains such as medical imaging, the need for transparent and trustworthy decision-making becomes paramount. Many explainability methods provide insights into how these…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-09 · Piotr Komorowski, Hubert Baniecki, Przemysław Biecek

Vision Transformer with Deformable Attention

Transformers have recently shown superior performances on various vision tasks. The large, sometimes even global, receptive field endows Transformer models with higher representation power over their CNN counterparts. Nevertheless, simply…

Computer Vision and Pattern Recognition · Computer Science · 2022-05-25 · Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

Revisiting Transformers with Insights from Image Filtering and Boosting

The self-attention mechanism, a cornerstone of Transformer-based state-of-the-art deep learning architectures, is largely heuristic-driven and fundamentally challenging to interpret. Establishing a robust theoretical foundation to explain…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-10 · Laziz U. Abdullaev, Maksim Tkachenko, Tan M. Nguyen

Interplay Between Belief Propagation and Transformer: Differential-Attention Message Passing Transformer

Transformer-based neural decoders have emerged as a promising approach to error correction coding, combining data-driven adaptability with efficient modeling of long-range dependencies. This paper presents a novel decoder architecture that…

Information Theory · Computer Science · 2025-09-22 · Chin Wa Lau, Xiang Shi, Ziyan Zheng, Haiwen Cao, Nian Guo

Self-Attention Attribution: Interpreting Information Interactions Inside Transformer

The great success of Transformer-based models benefits from the powerful multi-head self-attention mechanism, which learns token dependencies and encodes contextual information from the input. Prior work strives to attribute model decisions…

Computation and Language · Computer Science · 2021-02-26 · Yaru Hao, Li Dong, Furu Wei, Ke Xu