Related papers: Provably Explaining Neural Additive Models
Recently, there has been considerable progress on designing algorithms with provable guarantees -- typically using linear algebraic methods -- for parameter learning in latent variable models. But designing provable algorithms for inference…
Despite significant advancements in post-hoc explainability techniques for neural networks, many current methods rely on heuristics and do not provide formally provable guarantees over the explanations provided. Recent work has shown that…
Generalized Additive Models (GAMs) are commonly considered *interpretable* within the ML community, as their structure makes the relationship between inputs and outputs relatively understandable. Therefore, it may seem natural to…
Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning…
In recent years, there has been increasing interest in explanation methods for neural model predictions that offer precise formal guarantees. These include abductive (respectively, contrastive) methods, which aim to compute minimal subsets…
Neural additive model (NAM) is a recently proposed explainable artificial intelligence (XAI) method that utilizes neural network-based architectures. Given the advantages of neural networks, NAMs provide intuitive explanations for their…
Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks. However, they lack inherent mechanisms that provide calibrated uncertainties and enable selection…
Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient…
*Automated circuit discovery* is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods have made significant progress, they…
Recently generating natural language explanations has shown very promising results in not only offering interpretable explanations but also providing additional information and supervision for prediction. However, existing approaches…
Formal verification is only as good as the specification of a system, which is also true for neural network verification. Existing specifications follow the paradigm of data as specification, where the local neighborhood around a reference…
Neural predictive models have achieved remarkable performance improvements in various natural language processing tasks. However, most neural predictive models suffer from the lack of explainability of predictions, limiting their practical…
Probabilistic inferences distill knowledge from graphs to aid human make important decisions. Due to the inherent uncertainty in the model and the complexity of the knowledge, it is desirable to help the end-users understand the inference…
We consider the problem of explaining the predictions of an arbitrary blackbox model $f$: given query access to $f$ and an instance $x$, output a small set of $x$'s features that in conjunction essentially determines $f(x)$. We design an…
Although deep models achieve high predictive performance, it is difficult for humans to understand the predictions they made. Explainability is important for real-world applications to justify their reliability. Many example-based…
Deep neural networks (DNNs) have shown exceptional performances in a wide range of tasks and have become the go-to method for problems requiring high-level predictive power. There has been extensive research on how DNNs arrive at their…
There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive…
A barrier to the wider adoption of neural networks is their lack of interpretability. While local explanation methods exist for one prediction, most global attributions still reduce neural network decisions to a single set of features. In…
Much like admissibility is the key concept underlying preferred semantics, strong admissibility is the key concept underlying grounded semantics, as membership of a strongly admissible set is sufficient to show membership of the grounded…
Improving the explainability of the results from machine learning methods has become an important research goal. Here, we study the problem of making clusters more interpretable by extending a recent approach of [Davidson et al., NeurIPS…