Related papers: Human-interpretable model explainability on high-d…

The Definitions of Interpretability and Learning of Interpretable Models

As machine learning algorithms are adopted in an ever-increasing number of applications, interpretation has emerged as a crucial desideratum. In this paper, we propose a mathematical definition for the human-interpretable model. In…

Machine Learning · Computer Science 2021-06-01 Weishen Pan, Changshui Zhang

Learning Interpretable Concept-Based Models with Human Feedback

Machine learning models that first learn a representation of a domain in terms of human-understandable concepts, then use it to make predictions, have been proposed to facilitate interpretation and interaction with models trained on…

Machine Learning · Computer Science 2020-12-08 Isaac Lage, Finale Doshi-Velez

Techniques for Interpretable Machine Learning

Interpretable machine learning tackles the important problem that humans cannot understand the behaviors of complex machine learning models and how these models arrive at a particular decision. Although many approaches have been proposed, a…

Machine Learning · Computer Science 2019-05-21 Mengnan Du, Ninghao Liu, Xia Hu

Shapley explainability on the data manifold

Explainability in AI is crucial for model development, compliance with regulation, and providing operational nuance to predictions. The Shapley framework for explainability attributes a model's predictions to its input features in a…

Machine Learning · Computer Science 2021-12-21 Christopher Frye, Damien de Mijolla, Tom Begley, Laurence Cowton, Megan Stanley, Ilya Feige

MonoNet: Towards Interpretable Models by Learning Monotonic Features

Being able to interpret, or explain, the predictions made by a machine learning model is of fundamental importance. This is especially true when there is interest in deploying data-driven models to make high-stakes decisions, e.g. in…

Machine Learning · Computer Science 2019-10-01 An-phi Nguyen, María Rodríguez Martínez

Assessing the Local Interpretability of Machine Learning Models

The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on…

Machine Learning · Computer Science 2019-08-06 Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, Chitradeep Dutta Roy

From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the…

Artificial Intelligence · Computer Science 2021-09-21 David Alvarez-Melis, Harmanpreet Kaur, Hal Daumé, Hanna Wallach, Jennifer Wortman Vaughan

Interpretability of machine learning based prediction models in healthcare

There is a need to ensure that machine learning models are interpretable. Higher interpretability of the model means easier comprehension and explanation of future predictions for end-users. Further, interpretable machine learning models…

Machine Learning · Computer Science 2020-08-17 Gregor Stiglic, Primoz Kocbek, Nino Fijacko, Marinka Zitnik, Katrien Verbert, Leona Cilar

The Mythos of Model Interpretability

Supervised machine learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? We want models to be not only good, but interpretable. And yet…

Machine Learning · Computer Science 2017-03-07 Zachary C. Lipton

Model-Agnostic Interpretability of Machine Learning

Understanding why machine learning models behave the way they do empowers both system designers and end-users in many ways: in model selection, feature engineering, in order to trust and act upon the predictions, and in more intuitive user…

Machine Learning · Statistics 2016-06-20 Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Explainability as statistical inference

A wide variety of model explanation approaches have been proposed in recent years, all guided by very different rationales and heuristics. In this paper, we take a new route and cast interpretability as a statistical inference problem. We…

Machine Learning · Computer Science 2024-01-01 Hugo Henri Joseph Senetaire, Damien Garreau, Jes Frellsen, Pierre-Alexandre Mattei

On The Stability of Interpretable Models

Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or…

Machine Learning · Computer Science 2019-03-18 Riccardo Guidotti, Salvatore Ruggieri

A Framework to Learn with Interpretation

To tackle interpretability in deep learning, we present a novel framework to jointly learn a predictive model and its associated interpretation model. The interpreter provides both local and global interpretability about the predictive…

Machine Learning · Computer Science 2022-02-24 Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Interpretable Deep Learning: Interpretation, Interpretability, Trustworthiness, and Beyond

Deep neural networks have been well-known for their superb handling of various machine learning and artificial intelligence tasks. However, due to their over-parameterized black-box nature, it is often difficult to understand the prediction…

Machine Learning · Computer Science 2022-07-18 Xuhong Li, Haoyi Xiong, Xingjian Li, Xuanyu Wu, Xiao Zhang, Ji Liu, Jiang Bian, Dejing Dou

Model-Agnostic Interpretation Framework in Machine Learning: A Comparative Study in NBA Sports

The field of machine learning has seen tremendous progress in recent years, with deep learning models delivering exceptional performance across a range of tasks. However, these models often come at the cost of interpretability, as they…

Machine Learning · Computer Science 2024-01-08 Shun Liu

A constraints-based approach to fully interpretable neural networks for detecting learner behaviors

The increasing use of complex machine learning models in education has led to concerns about their interpretability, which in turn has spurred interest in developing explainability techniques that are both faithful to the model's inner…

Machine Learning · Computer Science 2025-05-13 Juan D. Pinto, Luc Paquette

Enhancing Interpretability for Vision Models via Shapley Value Optimization

Deep neural networks have demonstrated remarkable performance across various domains, yet their decision-making processes remain opaque. Although many explanation methods are dedicated to bringing the obscurity of DNNs to light, they…

Computer Vision and Pattern Recognition · Computer Science 2025-12-17 Kanglong Fan, Yunqiao Yang, Chen Ma

Explainability of Large Language Models: Opportunities and Challenges toward Generating Trustworthy Explanations

Large language models have exhibited impressive performance across a broad range of downstream tasks in natural language processing. However, how a language model predicts the next token and generates content is not generally understandable…

Computation and Language · Computer Science 2025-10-21 Shahin Atakishiyev, Housam K. B. Babiker, Jiayi Dai, Nawshad Farruque, Teruaki Hayashi, Nafisa Sadaf Hriti, Md Abed Rahman, Iain Smith, Mi-Young Kim, Osmar R. Zaïane, Randy Goebel

On the Semantic Interpretability of Artificial Intelligence Models

Artificial Intelligence models are becoming increasingly more powerful and accurate, supporting or even replacing humans' decision making. But with increased power and accuracy also comes higher complexity, making it hard for users to…

Artificial Intelligence · Computer Science 2019-07-10 Vivian S. Silva, André Freitas, Siegfried Handschuh

Interpretability with full complexity by constraining feature information

Interpretability is a pressing issue for machine learning. Common approaches to interpretable machine learning constrain interactions between features of the input, rendering the effects of those features on a model's output comprehensible…

Machine Learning · Computer Science 2023-05-11 Kieran A. Murphy, Dani S. Bassett