Related papers: Provably Explaining Neural Additive Models

Provable Algorithms for Inference in Topic Models

Recently, there has been considerable progress on designing algorithms with provable guarantees -- typically using linear algebraic methods -- for parameter learning in latent variable models. But designing provable algorithms for inference…

Machine Learning · Computer Science 2016-05-30 Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra

Explaining, Fast and Slow: Abstraction and Refinement of Provable Explanations

Despite significant advancements in post-hoc explainability techniques for neural networks, many current methods rely on heuristics and do not provide formally provable guarantees over the explanations provided. Recent work has shown that…

Machine Learning · Computer Science 2025-06-11 Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Matthias Althoff, Guy Katz

Additive Models Explained: A Computational Complexity Approach

Generalized Additive Models (GAMs) are commonly considered *interpretable* within the ML community, as their structure makes the relationship between inputs and outputs relatively understandable. Therefore, it may seem natural to…

Machine Learning · Computer Science 2026-02-06 Shahaf Bassan, Michal Moshkovitz, Guy Katz

Towards Explainable NLP: A Generative Explanation Framework for Text Classification

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning…

Computation and Language · Computer Science 2019-06-12 Hui Liu, Qingyu Yin, William Yang Wang

Cardinality-Minimal Explanations for Monotonic Neural Networks

In recent years, there has been increasing interest in explanation methods for neural model predictions that offer precise formal guarantees. These include abductive (respectively, contrastive) methods, which aim to compute minimal subsets…

Machine Learning · Computer Science 2023-05-03 Ouns El Harzli, Bernardo Cuenca Grau, Ian Horrocks

BayesNAM: Leveraging Inconsistency for Reliable Explanations

Neural additive model (NAM) is a recently proposed explainable artificial intelligence (XAI) method that utilizes neural network-based architectures. Given the advantages of neural networks, NAMs provide intuitive explanations for their…

Machine Learning · Computer Science 2024-11-12 Hoki Kim, Jinseong Park, Yujin Choi, Seungyun Lee, Jaewook Lee

Improving Neural Additive Models with Bayesian Principles

Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks. However, they lack inherent mechanisms that provide calibrated uncertainties and enable selection…

Machine Learning · Statistics 2024-10-29 Kouroche Bouchiat, Alexander Immer, Hugo Yèche, Gunnar Rätsch, Vincent Fortuin

Probabilistic Sufficient Explanations

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient…

Machine Learning · Computer Science 2021-05-24 Eric Wang, Pasha Khosravi, Guy Van den Broeck

Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees

*Automated circuit discovery* is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods have made significant progress, they…

Machine Learning · Computer Science 2026-02-20 Itamar Hadad, Guy Katz, Shahaf Bassan

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

Recently generating natural language explanations has shown very promising results in not only offering interpretable explanations but also providing additional information and supervision for prediction. However, existing approaches…

Computation and Language · Computer Science 2022-05-30 Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang

Learning Minimal Neural Specifications

Formal verification is only as good as the specification of a system, which is also true for neural network verification. Existing specifications follow the paradigm of data as specification, where the local neighborhood around a reference…

Machine Learning · Computer Science 2025-03-17 Chuqin Geng, Zhaoyue Wang, Haolin Ye, Xujie Si

You Can Do Better! If You Elaborate the Reason When Making Prediction

Neural predictive models have achieved remarkable performance improvements in various natural language processing tasks. However, most neural predictive models suffer from the lack of explainability of predictions, limiting their practical…

Computation and Language · Computer Science 2021-06-01 Dongfang Li, Jingcong Tao, Qingcai Chen, Baotian Hu

Scalable Explanation of Inferences on Large Graphs

Probabilistic inferences distill knowledge from graphs to aid human make important decisions. Due to the inherent uncertainty in the model and the complexity of the knowledge, it is desirable to help the end-users understand the inference…

Social and Information Networks · Computer Science 2019-08-21 Chao Chen, Yifei Liu, Xi Zhang, Sihong Xie

Provably efficient, succinct, and precise explanations

We consider the problem of explaining the predictions of an arbitrary blackbox model $f$: given query access to $f$ and an instance $x$, output a small set of $x$'s features that in conjunction essentially determines $f(x)$. We design an…

Machine Learning · Computer Science 2021-11-03 Guy Blanc, Jane Lange, Li-Yang Tan

Training Deep Models to be Explained with Fewer Examples

Although deep models achieve high predictive performance, it is difficult for humans to understand the predictions they made. Explainability is important for real-world applications to justify their reliability. Many example-based…

Machine Learning · Statistics 2021-12-08 Tomoharu Iwata, Yuya Yoshikawa

Structural Neural Additive Models: Enhanced Interpretable Machine Learning

Deep neural networks (DNNs) have shown exceptional performances in a wide range of tasks and have become the go-to method for problems requiring high-level predictive power. There has been extensive research on how DNNs arrive at their…

Machine Learning · Computer Science 2023-02-21 Mattias Luber, Anton Thielmann, Benjamin Säfken

Interpretable by Design: Learning Predictors by Composing Interpretable Queries

There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive…

Computer Vision and Pattern Recognition · Computer Science 2022-11-28 Aditya Chattopadhyay, Stewart Slocum, Benjamin D. Haeffele, Rene Vidal, Donald Geman

Global Explanations of Neural Networks: Mapping the Landscape of Predictions

A barrier to the wider adoption of neural networks is their lack of interpretability. While local explanation methods exist for one prediction, most global attributions still reduce neural network decisions to a single set of features. In…

Machine Learning · Computer Science 2019-02-08 Mark Ibrahim, Melissa Louie, Ceena Modarres, John Paisley

Strong Admissibility, a Tractable Algorithmic Approach (proofs)

Much like admissibility is the key concept underlying preferred semantics, strong admissibility is the key concept underlying grounded semantics, as membership of a strongly admissible set is sufficient to show membership of the grounded…

Artificial Intelligence · Computer Science 2022-04-08 Martin Caminada, Sri Harikrishnan

Efficient Algorithms for Generating Provably Near-Optimal Cluster Descriptors for Explainability

Improving the explainability of the results from machine learning methods has become an important research goal. Here, we study the problem of making clusters more interpretable by extending a recent approach of [Davidson et al., NeurIPS…

Data Structures and Algorithms · Computer Science 2020-02-10 Prathyush Sambaturu, Aparna Gupta, Ian Davidson, S. S. Ravi, Anil Vullikanti, Andrew Warren