Related papers: Training Machine Learning Models by Regularizing t…

Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations

Neural networks are among the most accurate supervised learning methods in use today, but their opacity makes them difficult to trust in critical applications, especially when conditions in training differ from those in test. Recent work on…

Machine Learning · Computer Science 2017-11-15 Andrew Slavin Ross, Michael C. Hughes, Finale Doshi-Velez

Learning to Scaffold: Optimizing Model Explanations for Teaching

Modern machine learning models are opaque, and as a result there is a burgeoning academic subfield on methods that explain these models' behavior. However, what is the precise goal of providing such explanations, and how can we demonstrate…

Machine Learning · Computer Science 2022-12-01 Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig

Training Deep Models to be Explained with Fewer Examples

Although deep models achieve high predictive performance, it is difficult for humans to understand the predictions they made. Explainability is important for real-world applications to justify their reliability. Many example-based…

Machine Learning · Statistics 2021-12-08 Tomoharu Iwata, Yuya Yoshikawa

Explaining Predictions from Machine Learning Models: Algorithms, Users, and Pedagogy

Model explainability has become an important problem in machine learning (ML) due to the increased effect that algorithmic predictions have on humans. Explanations can help users understand not only why ML models make certain predictions,…

Machine Learning · Computer Science 2022-09-13 Ana Lucic

Saliency Learning: Teaching the Model Where to Pay Attention

Deep learning has emerged as a compelling solution to many NLP tasks with remarkable performances. However, due to their opacity, such models are hard to interpret and trust. Recent work on explaining deep models has introduced approaches…

Computation and Language · Computer Science 2019-05-21 Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, Prasad Tadepalli

Opportunities and limitations of explaining quantum machine learning

A common trait of many machine learning models is that it is often difficult to understand and explain what caused the model to produce the given output. While the explainability of neural networks has been an active field of research in…

Quantum Physics · Physics 2024-12-20 Elies Gil-Fuster, Jonas R. Naujoks, Grégoire Montavon, Thomas Wiegand, Wojciech Samek, Jens Eisert

Evaluating Explanation Methods for Deep Learning in Security

Deep learning is increasingly used as a building block of security systems. Unfortunately, neural networks are hard to interpret and typically opaque to the practitioner. The machine learning community has started to address this problem by…

Machine Learning · Computer Science 2020-04-28 Alexander Warnecke, Daniel Arp, Christian Wressnegger, Konrad Rieck

Individual Explanations in Machine Learning Models: A Case Study on Poverty Estimation

Machine learning methods are being increasingly applied in sensitive societal contexts, where decisions impact human lives. Hence it has become necessary to build capabilities for providing easily-interpretable explanations of models'…

Machine Learning · Computer Science 2021-04-13 Alfredo Carrillo, Luis F. Cantú, Luis Tejerina, Alejandro Noriega

Explaining Deep Neural Networks

Deep neural networks are becoming more and more popular due to their revolutionary success in diverse areas, such as computer vision, natural language processing, and speech recognition. However, the decision-making processes of these…

Computation and Language · Computer Science 2021-10-15 Oana-Maria Camburu

The Intriguing Properties of Model Explanations

Linear approximations to the decision boundary of a complex model have become one of the most popular tools for interpreting predictions. In this paper, we study such linear explanations produced either post-hoc by a few recent methods or…

Machine Learning · Computer Science 2018-01-31 Maruan Al-Shedivat, Avinava Dubey, Eric P. Xing

Regularizing Explanations in Bayesian Convolutional Neural Networks

Neural networks are powerful function approximators with tremendous potential in learning complex distributions. However, they are prone to overfitting on spurious patterns. Bayesian inference provides a principled way to regularize neural…

Machine Learning · Computer Science 2024-12-02 Yanzhe Bekkemoen, Helge Langseth

Understanding Deep Neural Networks through Input Uncertainties

Techniques for understanding the functioning of complex machine learning models are becoming increasingly popular, not only to improve the validation process, but also to extract new insights about the data via exploratory analysis. Though…

Machine Learning · Statistics 2018-11-02 Jayaraman J. Thiagarajan, Irene Kim, Rushil Anirudh, Peer-Timo Bremer

A Unified Study of Machine Learning Explanation Evaluation Metrics

The growing need for trustworthy machine learning has led to the blossom of interpretability research. Numerous explanation methods have been developed to serve this purpose. However, these methods are deficiently and inappropriately…

Machine Learning · Computer Science 2022-03-29 Yipei Wang, Xiaoqian Wang

A Logic-Driven Framework for Consistency of Neural Models

While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a…

Artificial Intelligence · Computer Science 2019-09-16 Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar

On Trustworthy Rule-Based Models and Explanations

A task of interest in machine learning (ML) is that of ascribing explanations to the predictions made by ML models. Furthermore, in domains deemed high risk, the rigor of explanations is paramount. Indeed, incorrect explanations can and…

Artificial Intelligence · Computer Science 2025-07-11 Mohamed Siala, Jordi Planes, Joao Marques-Silva

Training Uncertainty-Aware Classifiers with Conformalized Deep Learning

Deep neural networks are powerful tools to detect hidden patterns in data and leverage them to make predictions, but they are not designed to understand uncertainty and estimate reliable probabilities. In particular, they tend to be…

Machine Learning · Statistics 2022-11-10 Bat-Sheva Einbinder, Yaniv Romano, Matteo Sesia, Yanfei Zhou

A constraints-based approach to fully interpretable neural networks for detecting learner behaviors

The increasing use of complex machine learning models in education has led to concerns about their interpretability, which in turn has spurred interest in developing explainability techniques that are both faithful to the model's inner…

Machine Learning · Computer Science 2025-05-13 Juan D. Pinto, Luc Paquette

Optimal Explanations of Linear Models

When predictive models are used to support complex and important decisions, the ability to explain a model's reasoning can increase trust, expose hidden biases, and reduce vulnerability to adversarial attacks. However, attempts at…

Machine Learning · Computer Science 2019-07-11 Dimitris Bertsimas, Arthur Delarue, Patrick Jaillet, Sebastien Martin

Towards Robust Interpretability with Self-Explaining Neural Networks

Most recent work on interpretability of complex machine learning models has focused on estimating a posteriori explanations for previously trained models around specific predictions. Self-explaining models where…

Machine Learning · Computer Science 2018-12-05 David Alvarez-Melis, Tommi S. Jaakkola

Towards Benchmarking Explainable Artificial Intelligence Methods

The currently dominating artificial intelligence and machine learning technology, neural networks, builds on inductive statistical learning. Neural networks of today are information processing systems void of understanding and reasoning…

Artificial Intelligence · Computer Science 2022-08-26 Lars Holmberg