Related papers: Learning Explanations from Language Data

200 papers

Deep neural networks are often considered opaque systems, prompting the need for explainability methods to improve trust and accountability. Existing approaches typically attribute test-time predictions either to input features (e.g.,…

Computer Vision and Pattern Recognition · Computer Science 2025-10-13 Aziz Bacha, Thomas George

We present a method for neural network interpretability by combining feature attribution with counterfactual explanations to generate attribution maps that highlight the most discriminative features between pairs of classes. We show that…

Machine Learning · Computer Science 2021-09-29 Nils Eckstein, Alexander S. Bates, Gregory S. X. E. Jefferis, Jan Funke

DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with…

Motivated by distinct, though related, criteria, a growing number of attribution methods have been developed to interpret deep learning. While each relies on the interpretability of the concept of "importance" and our ability to visualize…

Artificial Intelligence · Computer Science 2020-04-07 Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson

Feature attribution methods, which explain an individual prediction made by a model as a sum of attributions for each input feature, are an essential tool for understanding the behavior of complex deep learning models. However, ensuring…

Machine Learning · Computer Science 2020-10-28 Ethan Weinberger, Joseph Janizek, Su-In Lee

Most techniques for explainable machine learning focus on feature attribution, i.e., values are assigned to the features such that their sum equals the prediction. Example attribution is another form of explanation that assigns weights to…

Machine Learning · Computer Science 2025-02-28 Genghua Dong, Henrik Boström, Michalis Vazirgiannis, Roman Bresson

Deep Recurrent Neural Network (RNN) has gained popularity in many sequence classification tasks. Beyond predicting a correct class for each data instance, data scientists also want to understand what differentiating factors in the data have…

Machine Learning · Computer Science 2019-01-18 Chuan Wang, Takeshi Onishi, Keiichi Nemoto, Kwan-Liu Ma

Interpretability is crucial for machine learning algorithms in high-stakes medical applications. However, high-performing neural networks typically cannot explain their predictions. Post-hoc explanation methods provide a way to understand…

Computer Vision and Pattern Recognition · Computer Science 2025-11-14 Susu Sun, Stefano Woerner, Andreas Maier, Lisa M. Koch, Christian F. Baumgartner

Explaining recommendations enables users to understand whether recommended items are relevant to their needs and has been shown to increase their trust in the system. More generally, if designing explainable machine learning models is key…

Machine Learning · Computer Science 2020-08-27 Darius Afchar, Romain Hennequin

The increasing complexity of AI systems has made understanding their behavior critical. Numerous interpretability methods have been developed to attribute model behavior to three key aspects: input features, training data, and internal…

Machine Learning · Computer Science 2025-05-30 Shichang Zhang, Tessa Han, Usha Bhalla, Himabindu Lakkaraju

Ensuring the trustworthiness and interpretability of machine learning models is critical to their deployment in real-world applications. Feature attribution methods have gained significant attention, which provide local explanations of…

Machine Learning · Computer Science 2023-09-20 Md Abdul Kadir, Gowtham Krishna Addluri, Daniel Sonntag

Image attribution analysis seeks to highlight the feature representations learned by visual models such that the highlighted feature maps can reflect the pixel-wise importance of inputs. Gradient integration is a building block in the…

Computer Vision and Pattern Recognition · Computer Science 2025-06-26 Róisín Luo, James McDermott, Colm O'Riordan

Attribution explanation is a typical approach for explaining deep neural networks (DNNs), inferring an importance or contribution score for each input variable to the final output. In recent years, numerous attribution methods have been…

Machine Learning · Computer Science 2025-08-12 Huiqi Deng, Hongbin Pei, Quanshi Zhang, Mengnan Du

Conventionally, AI models are thought to trade off explainability for lower accuracy. We develop a training strategy that not only leads to a more explainable AI system for object classification, but as a consequence, suffers no perceptible…

Computer Vision and Pattern Recognition · Computer Science 2020-03-17 Andrea Zunino, Sarah Adel Bargal, Riccardo Volpi, Mehrnoosh Sameki, Jianming Zhang, Stan Sclaroff, Vittorio Murino, Kate Saenko

Feature attribution explains neural network outputs by identifying relevant input features. The attribution has to be faithful, meaning that the attributed features must mirror the input features that influence the output. One recent trend…

Machine Learning · Computer Science 2024-02-15 Yang Zhang, Yawei Li, Hannah Brown, Mina Rezaei, Bernd Bischl, Philip Torr, Ashkan Khakzar, Kenji Kawaguchi

Attribution maps are one of the most established tools to explain the functioning of computer vision models. They assign importance scores to input features, indicating how relevant each feature is for the prediction of a deep neural…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Robin Hesse, Simone Schaub-Meyer, Stefan Roth

Explainable AI aims to render model behavior understandable by humans, which can be seen as an intermediate step in extracting causal relations from correlative patterns. Due to the high risk of possible fatal decisions in image-based…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Lukas Klein, João B. S. Carvalho, Mennatallah El-Assady, Paolo Penna, Joachim M. Buhmann, Paul F. Jaeger

By now there is substantial evidence that deep learning models learn certain human-interpretable features as part of their internal representations of data. As having the right (or wrong) concepts is critical to trustworthy machine learning…

Machine Learning · Computer Science 2023-12-29 Nicholas Konz, Charles Godfrey, Madelyn Shapiro, Jonathan Tu, Henry Kvinge, Davis Brown

A basic assumption of statistical learning theory is that train and test data are drawn from the same underlying distribution. Unfortunately, this assumption doesn't hold in many applications. Instead, ample labeled data might exist in a…

Computer Vision and Pattern Recognition · Computer Science 2012-11-21 Oscar Beijbom

Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability…

Computer Vision and Pattern Recognition · Computer Science 2023-07-04 Khalid Saifullah, Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein