Related papers: Towards Compositionality in Concept Learning

Intrinsic Concept Extraction Based on Compositional Interpretability

Unsupervised Concept Extraction aims to extract concepts from a single image; however, existing methods suffer from the inability to extract composable intrinsic concepts. To address this, this paper introduces a new task called…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Hanyu Shi, Hong Tao, Guoheng Huang, Jianbin Jiang, Xuhang Chen, Chi-Man Pun, Shanhu Wang, Pan Pan

MCCE: Missingness-aware Causal Concept Explainer

Causal concept effect estimation is gaining increasing interest in the field of interpretable machine learning. This general approach explains the behaviors of machine learning models by estimating the causal effect of human-understandable…

Machine Learning · Computer Science 2024-11-15 Jifan Gao, Guanhua Chen

Towards Automatic Concept-based Explanations

Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions. Most of the current explanation methods provide explanations through feature…

Machine Learning · Statistics 2019-10-09 Amirata Ghorbani, James Wexler, James Zou, Been Kim

Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability

Concept-based interpretability methods aim to explain deep neural network model predictions using a predefined set of semantic concepts. These methods evaluate a trained model on a new, "probe" dataset and correlate model predictions with…

Computer Vision and Pattern Recognition · Computer Science 2023-05-15 Vikram V. Ramaswamy, Sunnie S. Y. Kim, Ruth Fong, Olga Russakovsky

EDUCE: Explaining model Decisions through Unsupervised Concepts Extraction

Providing explanations along with predictions is crucial in some text processing tasks. Therefore, we propose a new self-interpretable model that performs output prediction and simultaneously provides an explanation in terms of the presence…

Machine Learning · Computer Science 2019-09-30 Diane Bouchacourt, Ludovic Denoyer

Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build…

Machine Learning · Computer Science 2024-12-10 Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

Interpretable Compositional Convolutional Neural Networks

The reasonable definition of semantic interpretability presents the core challenge in explainable AI. This paper proposes a method to modify a traditional convolutional neural network (CNN) into an interpretable compositional CNN, in order…

Computer Vision and Pattern Recognition · Computer Science 2021-07-12 Wen Shen, Zhihua Wei, Shikun Huang, Binbin Zhang, Jiaqi Fan, Ping Zhao, Quanshi Zhang

Explaining Classifiers with Causal Concept Effect (CaCE)

How can we understand classification decisions made by deep neural networks? Many existing explainability methods rely solely on correlations and fail to account for confounding, which may result in potentially misleading explanations. To…

Machine Learning · Computer Science 2020-03-02 Yash Goyal, Amir Feder, Uri Shalit, Been Kim

From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?

A central goal of interpretability is to recover representations of causally relevant concepts from the activations of neural networks. The quality of these concept representations is typically evaluated in isolation, and under implicit…

Machine Learning · Computer Science 2025-12-18 Aaron Mueller, Andrew Lee, Shruti Joshi, Ekdeep Singh Lubana, Dhanya Sridhar, Patrik Reizinger

Radically Compositional Cognitive Concepts

Despite ample evidence that our concepts, our cognitive architecture, and mathematics itself are all deeply compositional, few models take advantage of this structure. We therefore propose a radically compositional approach to computational…

Neurons and Cognition · Quantitative Biology 2019-11-18 Toby B. St Clere Smithe

Understanding Inter-Concept Relationships in Concept-Based Models

Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between…

Machine Learning · Computer Science 2024-05-29 Naveen Raman, Mateo Espinosa Zarlenga, Mateja Jamnik

Concept-based Explanations using Non-negative Concept Activation Vectors and Decision Tree for CNN Models

This paper evaluates whether training a decision tree based on concepts extracted from a concept-based explainer can increase interpretability for Convolutional Neural Networks (CNNs) models and boost the fidelity and performance of the…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Gayda Mutahar, Tim Miller

A Complexity-Based Theory of Compositionality

Compositionality is believed to be fundamental to intelligence. In humans, it underlies the structure of thought, language, and higher-level reasoning. In AI, compositional representations can enable a powerful form of out-of-distribution…

Computation and Language · Computer Science 2025-06-04 Eric Elmoznino, Thomas Jiralerspong, Yoshua Bengio, Guillaume Lajoie

From Mechanistic to Compositional Interpretability

Mechanistic interpretability aims to explain neural model behaviour by reverse-engineering learned computational structure into human-understandable components. Without a formal framework, however, mechanistic explanations cannot be…

Machine Learning · Computer Science 2026-05-12 Ward Gauderis, Thomas Dooms, Steven T. Holmer, Kola Ayonrinde, Geraint A. Wiggins

Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery

Concept-based explainable approaches have emerged as a promising method in explainable AI because they can interpret models in a way that aligns with human reasoning. However, their adaption in the text domain remains limited. Most existing…

Computation and Language · Computer Science 2025-05-27 Yifan Sun, Danding Wang, Qiang Sheng, Juan Cao, Jintao Li

When are Post-hoc Conceptual Explanations Identifiable?

Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable…

Machine Learning · Statistics 2023-06-07 Tobias Leemann, Michael Kirchhof, Yao Rong, Enkelejda Kasneci, Gjergji Kasneci

Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization

Concept discovery is one of the open problems in the interpretability literature that is important for bridging the gap between non-deep learning experts and model end-users. Among current formulations, one defines concepts as a…

Machine Learning · Computer Science 2022-02-11 Adrianna Janik, Kris Sankaran

Concept-Based Abductive and Contrastive Explanations for Behaviors of Vision Models

Concept-based explanations offer a promising approach for explaining the predictions of deep neural networks in terms of high-level, human-understandable concepts. However, existing methods either do not establish a causal connection…

Machine Learning · Computer Science 2026-05-08 Ronaldo Canizales, Divya Gopinath, Corina Păsăreanu, Ravi Mangal

Investigating Inner Properties of Multimodal Representation and Semantic Compositionality with Brain-based Componential Semantics

Multimodal models have been proven to outperform text-based approaches on learning semantic representations. However, it still remains unclear what properties are encoded in multimodal representations, in what aspects do they outperform the…

Computation and Language · Computer Science 2017-11-23 Shaonan Wang, Jiajun Zhang, Nan Lin, Chengqing Zong

Bi-ICE: An Inner Interpretable Framework for Image Classification via Bi-directional Interactions between Concept and Input Embeddings

Inner interpretability is a promising field aiming to uncover the internal mechanisms of AI systems through scalable, automated methods. While significant research has been conducted on large language models, limited attention has been paid…

Computer Vision and Pattern Recognition · Computer Science 2025-12-09 Jinyung Hong, Yearim Kim, Keun Hee Park, Sangyu Han, Nojun Kwak, Theodore P. Pavlic