Related papers: Regularizing Black-box Models for Improved Interpr…

Regularizing Black-box Models for Improved Interpretability

Most of the work on interpretable machine learning has focused on designing either inherently interpretable models, which typically trade-off accuracy for interpretability, or post-hoc explanation systems, whose explanation quality can be…

Machine Learning · Computer Science 2020-11-10 Gregory Plumb, Maruan Al-Shedivat, Angel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar

An Evaluation of the Human-Interpretability of Explanation

Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains…

Machine Learning · Computer Science 2019-08-30 Isaac Lage, Emily Chen, Jeffrey He, Menaka Narayanan, Been Kim, Sam Gershman, Finale Doshi-Velez

Investigating the Duality of Interpretability and Explainability in Machine Learning

The rapid evolution of machine learning (ML) has led to the widespread adoption of complex "black box" models, such as deep neural networks and ensemble methods. These models exhibit exceptional predictive performance, making them…

Machine Learning · Computer Science 2025-03-28 Moncef Garouani, Josiane Mothe, Ayah Barhrhouj, Julien Aligon

Optimizing for Interpretability in Deep Neural Networks with Tree Regularization

Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to the adoption in many real world applications. There exists a large body of work aiming to help humans understand these black…

Machine Learning · Computer Science 2019-08-15 Mike Wu, Sonali Parbhoo, Michael C. Hughes, Volker Roth, Finale Doshi-Velez

Lifting Interpretability-Performance Trade-off via Automated Feature Engineering

Complex black-box predictive models may have high performance, but lack of interpretability causes problems like lack of trust, lack of stability, sensitivity to concept drift. On the other hand, achieving satisfactory accuracy of…

Machine Learning · Computer Science 2020-02-12 Alicja Gosiewska, Przemyslaw Biecek

Evaluating Explanation Without Ground Truth in Interpretable Machine Learning

Interpretable Machine Learning (IML) has become increasingly important in many real-world applications, such as autonomous cars and medical diagnosis, where explanations are significantly preferred to help people better understand how…

Machine Learning · Computer Science 2019-08-19 Fan Yang, Mengnan Du, Xia Hu

Towards Robust Interpretability with Self-Explaining Neural Networks

Most recent work on interpretability of complex machine learning models has focused on estimating $\textit{a posteriori}$ explanations for previously trained models around specific predictions. $\textit{Self-explaining}$ models where…

Machine Learning · Computer Science 2018-12-05 David Alvarez-Melis, Tommi S. Jaakkola

Interpretable Companions for Black-Box Models

We present an interpretable companion model for any pre-trained black-box classifiers. The idea is that for any input, a user can decide to either receive a prediction from the black-box model, with high accuracy but no explanations, or…

Machine Learning · Statistics 2020-02-12 Danqing Pan, Tong Wang, Satoshi Hara

Local Interpretations for Explainable Natural Language Processing: A Survey

As the use of deep learning techniques has grown across various fields over the past decade, complaints about the opaqueness of the black-box models have increased, resulting in an increased focus on transparency in deep learning models.…

Computation and Language · Computer Science 2024-03-19 Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon

Demystifying the Accuracy-Interpretability Trade-Off: A Case Study of Inferring Ratings from Reviews

Interpretable machine learning models offer understandable reasoning behind their decision-making process, though they may not always match the performance of their black-box counterparts. This trade-off between interpretability and model…

Artificial Intelligence · Computer Science 2025-03-12 Pranjal Atrey, Michael P. Brundage, Min Wu, Sanghamitra Dutta

Interpretable and Explainable Machine Learning Methods for Predictive Process Monitoring: A Systematic Literature Review

This paper presents a systematic literature review (SLR) on the explainability and interpretability of machine learning (ML) models within the context of predictive process mining, using the PRISMA framework. Given the rapid advancement of…

Machine Learning · Computer Science 2024-01-01 Nijat Mehdiyev, Maxim Majlatow, Peter Fettke

The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

Machine learning models in safety-critical settings like healthcare are often blackboxes: they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable…

Machine Learning · Computer Science 2022-06-03 Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

Hybrid Predictive Model: When an Interpretable Model Collaborates with a Black-box Model

Interpretable machine learning has become a strong competitor for traditional black-box models. However, the possible loss of the predictive performance for gaining interpretability is often inevitable, putting practitioners in a dilemma of…

Machine Learning · Computer Science 2019-05-13 Tong Wang, Qihang Lin

An interpretable neural network model through piecewise linear approximation

Most existing interpretable methods explain a black-box model in a post-hoc manner, which uses simpler models or data analysis techniques to interpret the predictions after the model is learned. However, they (a) may derive contradictory…

Machine Learning · Computer Science 2020-01-22 Mengzhuo Guo, Qingpeng Zhang, Xiuwu Liao, Daniel Dajun Zeng

Model Interpretability and Rationale Extraction by Input Mask Optimization

Concurrent to the rapid progress in the development of neural-network based models in areas like natural language processing and computer vision, the need for creating explanations for the predictions of these black-box models has risen…

Computation and Language · Computer Science 2025-08-18 Marc Brinner, Sina Zarriess

On the Relationship Between Interpretability and Explainability in Machine Learning

Interpretability and explainability have gained more and more attention in the field of machine learning as they are crucial when it comes to high-stakes decisions and troubleshooting. Since both provide information about predictors and…

Machine Learning · Computer Science 2024-04-26 Benjamin Leblanc, Pascal Germain

Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions

From self-driving vehicles and back-flipping robots to virtual assistants who book our next appointment at the hair salon or at that restaurant for dinner - machine learning systems are becoming increasingly ubiquitous. The main reason for…

Machine Learning · Computer Science 2018-08-16 Milo Honegger

A Framework for Inherently Interpretable Optimization Models

With dramatic improvements in optimization software, the solution of large-scale problems that seemed intractable decades ago are now a routine task. This puts even more real-world applications into the reach of optimizers. At the same…

Optimization and Control · Mathematics 2023-03-07 Marc Goerigk, Michael Hartisch

What Makes a Good Explanation?: A Harmonized View of Properties of Explanations

Interpretability provides a means for humans to verify aspects of machine learning (ML) models and empower human+ML teaming in situations where the task cannot be fully automated. Different contexts require explanations with different…

Machine Learning · Computer Science 2024-07-15 Zixi Chen, Varshini Subhash, Marton Havasi, Weiwei Pan, Finale Doshi-Velez

Manipulating and Measuring Model Interpretability

With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed,…

Artificial Intelligence · Computer Science 2021-08-17 Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna Wallach