Related papers: Explaining Deep Learning Models with Constrained A…

Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review

Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of…

Machine Learning · Computer Science 2022-11-17 Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, Chirag Shah

Convex optimization for actionable & plausible counterfactual explanations

Transparency is an essential requirement of machine learning based decision making systems that are deployed in real world. Often, transparency of a given system is achieved by providing explanations of the behavior and predictions of the…

Machine Learning · Computer Science 2021-05-18 André Artelt, Barbara Hammer

Generating Counterfactual Explanations Using Cardinality Constraints

Providing explanations about how machine learning algorithms work and/or make particular predictions is one of the main tools that can be used to improve their trustworthiness, fairness and robustness. Among the most intuitive type of…

Machine Learning · Computer Science 2024-04-12 Rubén Ruiz-Torrubiano

Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples

The last decade has witnessed the proliferation of Deep Learning models in many applications, achieving unrivaled levels of predictive performance. Unfortunately, the black-box nature of Deep Learning models has posed unanswered questions…

Machine Learning · Computer Science 2020-03-26 Alejandro Barredo-Arrieta, Javier Del Ser

Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

Post-hoc explanations of machine learning models are crucial for people to understand and act on algorithmic predictions. An intriguing class of explanations is through counterfactuals, hypothetical examples that show people how to obtain a…

Machine Learning · Computer Science 2019-12-09 Ramaravind Kommiya Mothilal, Amit Sharma, Chenhao Tan

Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis

As machine learning (ML) models become more widely deployed in high-stakes applications, counterfactual explanations have emerged as key tools for providing actionable model explanations in practice. Despite the growing popularity of…

Machine Learning · Computer Science 2022-12-16 Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, Himabindu Lakkaraju

Counterfactual Instances Explain Little

In many applications, it is important to be able to explain the decisions of machine learning systems. An increasingly popular approach has been to seek to provide counterfactual instance explanations. These specify close possible…

Artificial Intelligence · Computer Science 2021-09-22 Adam White, Artur d'Avila Garcez

Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations

Interpretable machine learning seeks to understand the reasoning process of complex black-box systems that are long notorious for lack of explainability. One flourishing approach is through counterfactual explanations, which provide…

Artificial Intelligence · Computer Science 2023-06-02 Vy Vo, Trung Le, Van Nguyen, He Zhao, Edwin Bonilla, Gholamreza Haffari, Dinh Phung

Counterfactual Explanations for Clustering Models

Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend, especially for individuals who lack technical expertise. While many explainable artificial intelligence techniques exist for supervised…

Machine Learning · Computer Science 2024-09-20 Aurora Spagnol, Kacper Sokol, Pietro Barbiero, Marc Langheinrich, Martin Gjoreski

Counterfactual explanation of machine learning survival models

A method for counterfactual explanation of machine learning survival models is proposed. One of the difficulties of solving the counterfactual explanation problem is that the classes of examples are implicitly defined through outcomes of a…

Machine Learning · Computer Science 2020-07-01 Maxim S. Kovalev, Lev V. Utkin

Learning with Explanation Constraints

As larger deep learning models are hard to interpret, there has been a recent focus on generating explanations of these black-box models. In contrast, we may have a priori explanations of how models should behave. In this paper, we formalize…

Machine Learning · Computer Science 2023-12-27 Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar

Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

For an explanation of a deep learning model to be effective, it must provide both insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning…

Machine Learning · Computer Science 2020-10-09 Laura Rieger, Chandan Singh, W. James Murdoch, Bin Yu

When and How to Fool Explainable Models (and Humans) with Adversarial Examples

Reliable deployment of machine learning models such as neural networks continues to be challenging due to several limitations. Some of the main shortcomings are the lack of interpretability and the lack of robustness against adversarial…

Machine Learning · Computer Science 2025-02-18 Jon Vadillo, Roberto Santana, Jose A. Lozano

HyConEx: Hypernetwork classifier with counterfactual explanations for tabular data

In recent years, there has been a growing interest in explainable AI methods. In addition to making accurate predictions, we also want to understand what the model's decision is based on. One of the fundamental levels of interpretability is…

Machine Learning · Computer Science 2026-03-11 Patryk Marszałek, Kamil Książek, Oleksii Furman, Ulvi Movsum-zada, Przemysław Spurek, Marek Śmieja

Adversarial examples from computational constraints

Why are classifiers in high dimension vulnerable to "adversarial" perturbations? We show that it is likely not due to information theoretic limitations, but rather it could be due to computational constraints. First we prove that, for a…

Machine Learning · Statistics 2018-05-28 Sébastien Bubeck, Eric Price, Ilya Razenshteyn

Generating Feasible and Plausible Counterfactual Explanations for Outcome Prediction of Business Processes

In recent years, various machine and deep learning architectures have been successfully introduced to the field of predictive process analytics. Nevertheless, the inherent opacity of these algorithms poses a significant challenge for human…

Artificial Intelligence · Computer Science 2024-03-15 Alexander Stevens, Chun Ouyang, Johannes De Smedt, Catarina Moreira

Model-Agnostic Counterfactual Explanations for Consequential Decisions

Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide…

Machine Learning · Computer Science 2020-03-02 Amir-Hossein Karimi, Gilles Barthe, Borja Balle, Isabel Valera

Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers

To construct interpretable explanations that are consistent with the original ML model, counterfactual examples (showing how the model's output changes with small perturbations to the input) have been proposed. This paper extends the work…

Machine Learning · Computer Science 2020-06-16 Divyat Mahajan, Chenhao Tan, Amit Sharma

Model-agnostic and Scalable Counterfactual Explanations via Reinforcement Learning

Counterfactual instances are a powerful tool to obtain valuable insights into automated decision processes, describing the necessary minimal changes in the input space to alter the prediction towards a desired target. Most previous…

Machine Learning · Computer Science 2021-06-07 Robert-Florian Samoilescu, Arnaud Van Looveren, Janis Klaise

A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers

Text classifiers are vulnerable to adversarial examples: correctly-classified examples that are deliberately transformed to be misclassified while satisfying acceptability constraints. The conventional approach to finding adversarial…

Computation and Language · Computer Science 2024-05-21 Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi