Position: Do Not Explain Vision Models Without Context

Computer Vision and Pattern Recognition 2024-06-04 v3 Artificial Intelligence Machine Learning

Abstract

Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship between the two objects. If this is obvious, why don't explanation methods for vision models use contextual information? In this paper, we (1) review the most popular methods for explaining computer vision models and point out that they do not take context information into account, (2) show examples of failures of popular XAI methods, (3) provide examples of real-world use cases where spatial context plays a significant role, (4) propose new research directions that may lead to better use of context information in explaining computer vision models, and (5) argue that the approach to explanations needs to shift from 'where' to 'how'.

Cite

@article{arxiv.2404.18316,
  title  = {Position: Do Not Explain Vision Models Without Context},
  author = {Paulina Tomaszewska and Przemysław Biecek},
  journal = {arXiv preprint arXiv:2404.18316},
  year   = {2024}
}

Comments

Accepted at International Conference on Machine Learning (ICML) 2024
