Aggregating explanation methods for stable and robust explainability

Laura Rieger; Lars Kai Hansen

Machine Learning 2020-03-23 v5 Artificial Intelligence Machine Learning

Authors: Laura Rieger, Lars Kai Hansen

Abstract

Despite a growing literature on explaining neural networks, no consensus has been reached on how to explain a neural network decision or how to evaluate an explanation. Our contributions in this paper are twofold. First, we investigate schemes to combine explanation methods and reduce model uncertainty to obtain a single aggregated explanation. We provide evidence that the aggregation is better at identifying important features than individual methods. Adversarial attacks on explanations are a recent, active research topic. As our second contribution, we present evidence that aggregate explanations are much more robust to attacks than individual explanation methods.
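The idea of combining several explanation methods into a single aggregated explanation can be illustrated with a minimal sketch. The abstract does not specify the aggregation scheme, so the code below assumes the simplest plausible one: each method's attribution vector is normalized to unit absolute sum and the results are averaged per feature. The function names `normalize` and `aggregate_mean` and the example attribution values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def normalize(attribution: Sequence[float]) -> List[float]:
    """Scale absolute attributions so they sum to 1 (all-zero input stays zero)."""
    magnitudes = [abs(a) for a in attribution]
    total = sum(magnitudes)
    if total == 0:
        return [0.0] * len(magnitudes)
    return [m / total for m in magnitudes]


def aggregate_mean(attributions: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalized attributions of several methods, feature by feature.

    Assumption: plain averaging; the paper's actual scheme may differ.
    """
    if not attributions:
        raise ValueError("need at least one attribution vector")
    n_features = len(attributions[0])
    if any(len(a) != n_features for a in attributions):
        raise ValueError("all attribution vectors must have the same length")
    normalized = [normalize(a) for a in attributions]
    return [sum(column) / len(normalized) for column in zip(*normalized)]


# Hypothetical attributions for three features from three explanation methods.
method_a = [0.2, -0.6, 0.2]
method_b = [1.0, 3.0, 0.0]
method_c = [0.0, 5.0, 5.0]

aggregated = aggregate_mean([method_a, method_b, method_c])
# Index of the feature the aggregate ranks as most important.
top_feature = max(range(len(aggregated)), key=aggregated.__getitem__)
```

Normalizing before averaging keeps a method with large raw magnitudes from dominating the aggregate, which is one reason an average of this kind may be less sensitive to a single method's errors.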

Keywords

explainable artificial intelligence, argumentation mining, neural network

Cite

@article{arxiv.1903.00519,
  title  = {Aggregating explanation methods for stable and robust explainability},
  author = {Laura Rieger and Lars Kai Hansen},
  journal= {arXiv preprint arXiv:1903.00519},
  year   = {2020}
}

Related papers

Computation and Language · Computer Science

Robustness of Explanation Methods for NLP Models

Shriya Atmakuri, Tejas Chheda, Dinesh Kandula, Nishant Yadav +2