Influence-Driven Explanations for Bayesian Network Classifiers

Artificial Intelligence 2021-03-11 v3

Abstract

One of the most pressing issues in AI in recent years has been the need to address the lack of explainability of many of its models. We focus on explanations for discrete Bayesian network classifiers (BCs), targeting greater transparency of their inner workings by including intermediate variables in explanations, rather than just the input and output variables as is standard practice. The proposed influence-driven explanations (IDXs) for BCs are systematically generated using the causal relationships between variables within the BC, called influences, which are then categorised by logical requirements, called relation properties, according to their behaviour. These relation properties both provide guarantees beyond heuristic explanation methods and allow the information underpinning an explanation to be tailored to the requirements of a particular context and user, e.g., IDXs may be dialectical or counterfactual. We demonstrate IDXs' capability to explain various forms of BCs, e.g., naive or multi-label, binary or categorical, and also integrate recent approaches to explanations for BCs from the literature. We evaluate IDXs with theoretical and empirical analyses, demonstrating their considerable advantages when compared with existing explanation methods.

Cite

@article{arxiv.2012.05773,
  title  = {Influence-Driven Explanations for Bayesian Network Classifiers},
  author = {Antonio Rago and Emanuele Albini and Pietro Baroni and Francesca Toni},
  journal= {arXiv preprint arXiv:2012.05773},
  year   = {2021}
}

Comments

11 pages, 2 figures
