Related papers: Do Explanations Explain? Model Knows Best

200 papers

Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be…

Linear approximations to the decision boundary of a complex model have become one of the most popular tools for interpreting predictions. In this paper, we study such linear explanations produced either post-hoc by a few recent methods or…

Machine Learning · Computer Science 2018-01-31 Maruan Al-Shedivat, Avinava Dubey, Eric P. Xing

For AI systems to garner widespread public acceptance, we must develop methods capable of explaining the decisions of black-box models such as neural networks. In this work, we identify two issues of current explanatory methods. First, we…

Computation and Language · Computer Science 2019-12-06 Oana-Maria Camburu, Eleonora Giunchiglia, Jakob Foerster, Thomas Lukasiewicz, Phil Blunsom

While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of…

Explanations of model behavior are commonly evaluated via proxy properties weakly tied to the purposes explanations serve in practice. We contribute a decision theoretic framework that treats explanations as information signals valued by…

Artificial Intelligence · Computer Science 2026-02-24 Ziyang Guo, Berk Ustun, Jessica Hullman

We regard explanations as a blending of the input sample and the model's output and offer a few definitions that capture various desired properties of the function that generates these explanations. We study the links between these…

Machine Learning · Computer Science 2020-01-16 Lior Wolf, Tomer Galanti, Tamir Hazan

The challenge of creating interpretable models has been taken up by two main research communities: ML researchers primarily focused on lower-level explainability methods that suit the needs of engineers, and HCI researchers who have more…

Machine Learning · Computer Science 2024-07-16 Juan D. Pinto, Luc Paquette

Neural networks are among the most accurate supervised learning methods in use today. However, their opacity makes them difficult to trust in critical applications, especially when conditions in training may differ from those in practice.…

Machine Learning · Computer Science 2018-10-03 Andrew Slavin Ross

Recent legislative regulations have underlined the need for accountable and transparent artificial intelligence systems and have contributed to a growing interest in the Explainable Artificial Intelligence (XAI) field. Nonetheless, the lack…

Machine Learning · Computer Science 2025-10-14 Ilaria Vascotto, Alex Rodriguez, Alessandro Bonaita, Luca Bortolussi

As machine learning models are increasingly considered for high-stakes domains, effective explanation methods are crucial to ensure that their prediction strategies are transparent to the user. Over the years, numerous metrics have been…

Machine Learning · Computer Science 2025-04-14 Johannes Maeß, Grégoire Montavon, Shinichi Nakajima, Klaus-Robert Müller, Thomas Schnake

Despite the recent progress in deep neural networks (DNNs), it remains challenging to explain the predictions made by DNNs. Existing explanation methods for DNNs mainly focus on post-hoc explanations where another explanatory model is…

Machine Learning · Computer Science 2024-01-04 Wei Qian, Chenxu Zhao, Yangyi Li, Fenglong Ma, Chao Zhang, Mengdi Huai

Despite a growing literature on explaining neural networks, no consensus has been reached on how to explain a neural network decision or how to evaluate an explanation. Our contributions in this paper are twofold. First, we investigate…

Machine Learning · Computer Science 2020-03-23 Laura Rieger, Lars Kai Hansen

When explaining the decisions of deep neural networks, simple stories are tempting but dangerous. Especially in computer vision, the most popular explanation approaches give a false sense of comprehension to their users and provide an overly…

Machine Learning · Computer Science 2021-09-17 Matthias Kirchler, Martin Graf, Marius Kloft, Christoph Lippert

Supervised machine learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? We want models to be not only good, but interpretable. And yet…

Machine Learning · Computer Science 2017-03-07 Zachary C. Lipton

To ensure the reliability of the explanations of machine learning models, it is crucial to establish their advantages and limits and the cases in which each of these methods outperforms the others. However, the current understanding of when and how…

Machine Learning · Computer Science 2025-02-12 Célia Wafa Ayad, Thomas Bonnier, Benjamin Bosch, Sonali Parbhoo, Jesse Read

With the continued development of Convolutional Neural Networks (CNNs), there is a growing concern regarding the representations that they encode internally. Analyzing these internal representations is referred to as model interpretation. While…

Computer Vision and Pattern Recognition · Computer Science 2023-05-18 Hamed Behzadi-Khormouji, José Oramas

The ability to explain neural network decisions goes hand in hand with their safe deployment. Several methods have been proposed to highlight features important for a given network decision. However, there is no consensus on how to…

Computer Vision and Pattern Recognition · Computer Science 2020-03-23 Agnieszka Grabska-Barwińska

We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the…

Artificial Intelligence · Computer Science 2021-09-21 David Alvarez-Melis, Harmanpreet Kaur, Hal Daumé, Hanna Wallach, Jennifer Wortman Vaughan

While deep neural networks have achieved remarkable performance, they tend to lack transparency in prediction. The pursuit of greater interpretability in neural networks often results in a degradation of their original performance. Some…

Computer Vision and Pattern Recognition · Computer Science 2024-08-09 Hefeng Wu, Hao Jiang, Keze Wang, Ziyi Tang, Xianghuan He, Liang Lin

Explanation methods have emerged as an important tool to highlight the features responsible for the predictions of neural networks. There is mounting evidence that many explanation methods are rather unreliable and susceptible to malicious…

Computation and Language · Computer Science 2022-06-27 Shriya Atmakuri, Tejas Chheda, Dinesh Kandula, Nishant Yadav, Taesung Lee, Hessel Tuinhof