Bridging Interpretability and Robustness Using LIME-Guided Model Refinement

Navid Nayyem; Abdullah Rakin; Longwei Wang

Machine Learning 2024-12-30 v1 Artificial Intelligence

Abstract

This paper explores the intricate relationship between interpretability and robustness in deep learning models. Despite their remarkable performance across various tasks, deep learning models often exhibit critical vulnerabilities, including susceptibility to adversarial attacks, over-reliance on spurious correlations, and a lack of transparency in their decision-making processes. To address these limitations, we propose a novel framework that leverages Local Interpretable Model-Agnostic Explanations (LIME) to systematically enhance model robustness. By identifying and mitigating the influence of irrelevant or misleading features, our approach iteratively refines the model, penalizing reliance on these features during training. Empirical evaluations on multiple benchmark datasets demonstrate that LIME-guided refinement not only improves interpretability but also significantly enhances resistance to adversarial perturbations and generalization to out-of-distribution data.
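The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It fits a LIME-style local linear surrogate around one input and turns the surrogate weights on features flagged as spurious into a penalty that could be added to a training loss. The function names, the Gaussian perturbation scheme, the exponential proximity kernel, and the squared-weight penalty are all assumptions made for illustration.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_style_weights(predict, x, n_samples=500, sigma=0.5,
                       kernel_width=1.0, ridge=1e-3, seed=0):
    """Fit a proximity-weighted linear surrogate around x.

    Returns one coefficient per feature (the intercept is dropped).
    Assumed scheme: Gaussian perturbations, exponential kernel, ridge fit.
    """
    rng = random.Random(seed)
    d = len(x)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]  # X^T W X
    xtwy = [0.0] * (d + 1)                          # X^T W y
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, sigma) for xi in x]
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        w = math.exp(-dist2 / kernel_width ** 2)    # proximity weight
        row = [1.0] + [zi - xi for zi, xi in zip(z, x)]
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    for i in range(1, d + 1):
        xtwx[i][i] += ridge                         # regularize slopes only
    return solve_linear_system(xtwx, xtwy)[1:]


def spurious_reliance_penalty(weights, spurious_idx):
    """Sum of squared surrogate weights on features flagged as spurious."""
    return sum(weights[i] ** 2 for i in spurious_idx)


if __name__ == "__main__":
    # Hypothetical model that leans on feature 2, which we treat as spurious.
    model = lambda z: 2.0 * z[0] + 0.5 * z[2]
    weights = lime_style_weights(model, [1.0, -1.0, 0.3])
    print(weights, spurious_reliance_penalty(weights, [2]))
```

In a refinement loop of the kind the abstract describes, a term like this would be recomputed after each round of explanation and added to the task loss with some weighting factor; how the paper selects the flagged features and weights the penalty is not stated here.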

Keywords

interpretable machine learning; adversarial robustness; deep learning

Cite

@article{arxiv.2412.18952,
  title   = {Bridging Interpretability and Robustness Using LIME-Guided Model Refinement},
  author  = {Navid Nayyem and Abdullah Rakin and Longwei Wang},
  journal = {arXiv preprint arXiv:2412.18952},
  year    = {2024}
}

Comments

10 pages, 15 figures