Concept Gradient: Concept-based Interpretation Without Linear Assumption

Andrew Bai; Chih-Kuan Yeh; Pradeep Ravikumar; Neil Y. C. Lin; Cho-Jui Hsieh

Machine Learning 2024-02-07 v2

Abstract

Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based interpretation is the Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and concepts. Linear separability is usually implicitly assumed but does not hold true in general. In this work, we started from the original intent of concept-based interpretation and proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions. We showed that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in the concept affects the model's prediction, which leads to an extension of gradient-based interpretation to the concept space. We demonstrated empirically that CG outperforms CAV in both toy examples and real-world datasets.
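The idea of measuring how a small change in a non-linear concept affects a prediction can be illustrated with a minimal sketch. It is not taken from this page: the toy functions `concept` and `model`, the finite-difference helper, and the choice to invert the chain rule by dividing by the squared norm of the concept gradient (the pseudo-inverse for a single concept) are all assumptions made for illustration.

```python
import numpy as np

def numerical_gradient(fn, x, eps=1e-6):
    """Central-difference gradient of a scalar function fn at x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (fn(x + step) - fn(x - step)) / (2 * eps)
    return grad

def concept(x):
    # Hypothetical non-linear concept: squared distance from the origin.
    return float(np.sum(x ** 2))

def model(x):
    # Hypothetical model whose output depends on x only through the concept.
    return float(np.sin(concept(x)))

def concept_gradient(model_fn, concept_fn, x):
    # Assumed formulation: by the chain rule, grad f = (df/dc) * grad g,
    # so df/dc is recovered by projecting grad f onto grad g.
    grad_f = numerical_gradient(model_fn, x)
    grad_g = numerical_gradient(concept_fn, x)
    return float(grad_g @ grad_f / (grad_g @ grad_g))

x = np.array([0.6, -0.8])
cg = concept_gradient(model, concept, x)
expected = float(np.cos(concept(x)))  # analytic d sin(c) / dc at c = g(x)
print(cg, expected)
```

In this toy setting the concept is not a linear function of the input, yet the recovered sensitivity agrees with the analytic derivative of the model with respect to the concept, up to finite-difference error.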

Keywords

concept bottleneck model

Cite

@article{arxiv.2208.14966,
  title  = {Concept Gradient: Concept-based Interpretation Without Linear Assumption},
  author = {Andrew Bai and Chih-Kuan Yeh and Pradeep Ravikumar and Neil Y. C. Lin and Cho-Jui Hsieh},
  journal= {arXiv preprint arXiv:2208.14966},
  year   = {2024}
}

Comments

21 pages, 7 figures, published in ICLR 2023
