Related papers: Conditional Feature Importance for Mixed Data

Testing Conditional Independence in Supervised Learning Algorithms

We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of…

Methodology · Statistics 2021-05-14 David S. Watson, Marvin N. Wright

Conditional Feature Importance revisited: Double Robustness, Efficiency and Inference

Conditional Feature Importance (CFI) is a classical variable importance measure that accounts for the relationship between the studied feature and the others. However, CFI has not yet been studied from a theoretical perspective because the…

Statistics Theory · Mathematics 2026-02-03 Angel Reyero-Lobo, Pierre Neuvial, Bertrand Thirion

Challenges in Variable Importance Ranking Under Correlation

Variable importance plays a pivotal role in interpretable machine learning as it helps measure the impact of factors on the output of the prediction model. Model agnostic methods based on the generation of "null" features via permutation…

Machine Learning · Statistics 2024-02-07 Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski

Statistically Valid Variable Importance Assessment through Conditional Permutations

Variable importance assessment has become a crucial step in machine-learning applications when using complex learners, such as deep neural networks, on large-scale data. Removal-based importance assessment is currently the reference…

Machine Learning · Computer Science 2023-10-27 Ahmad Chamma, Denis A. Engemann, Bertrand Thirion

Model-agnostic Feature Importance and Effects with Dependent Features -- A Conditional Subgroup Approach

The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation.…

Machine Learning · Statistics 2023-11-09 Christoph Molnar, Gunnar König, Bernd Bischl, Giuseppe Casalicchio

Disentangled Feature Importance

Feature importance (FI) measures are widely used to assess the contributions of predictors to an outcome, but they may target different notions of relevance. When predictors are correlated, traditional statistical FI methods are often…

Machine Learning · Statistics 2026-03-17 Jin-Hong Du, Kathryn Roeder, Larry Wasserman

The Berkelmans-Pries Feature Importance Method: A Generic Measure of Informativeness of Features

Over the past few years, the use of machine learning models has emerged as a generic and powerful means for prediction purposes. At the same time, there is a growing demand for interpretability of prediction models. To determine which…

Machine Learning · Computer Science 2023-01-13 Joris Pries, Guus Berkelmans, Sandjai Bhulai, Rob van der Mei

Discovering Conditionally Salient Features with Statistical Guarantees

The goal of feature selection is to identify important features that are relevant to explain an outcome variable. Most of the work in this domain has focused on identifying globally relevant features, which are features that are related to…

Machine Learning · Statistics 2019-05-30 Jaime Roquero Gimenez, James Zou

Variable Importance in High-Dimensional Settings Requires Grouping

Explaining the decision process of machine learning algorithms is nowadays crucial for both model's performance enhancement and human comprehension. This can be achieved by assessing the variable importance of single variables, even for…

Machine Learning · Computer Science 2023-12-19 Ahmad Chamma, Bertrand Thirion, Denis A. Engemann

Feature Importance Measure for Non-linear Learning Algorithms

Complex problems may require sophisticated, non-linear learning methods such as kernel machines or deep neural networks to achieve state of the art prediction accuracies. However, high prediction accuracies are not the only objective to…

Artificial Intelligence · Computer Science 2016-11-24 Marina M.-C. Vidovic, Nico Görnitz, Klaus-Robert Müller, Marius Kloft

Inherent Inconsistencies of Feature Importance

The rapid advancement and widespread adoption of machine learning-driven technologies have underscored the practical and ethical need for creating interpretable artificial intelligence systems. Feature importance, a method that assigns…

Machine Learning · Computer Science 2023-12-07 Nimrod Harel, Uri Obolski, Ran Gilad-Bachrach

CCMI : Classifier based Conditional Mutual Information Estimation

Conditional Mutual Information (CMI) is a measure of conditional dependence between random variables X and Y, given another random variable Z. It can be used to quantify conditional dependence among variables in many data-driven inference…

Machine Learning · Computer Science 2019-06-10 Sudipto Mukherjee, Himanshu Asnani, Sreeram Kannan

The Conditional Prediction Function: A Novel Technique to Control False Discovery Rate for Complex Models

In modern scientific research, the objective is often to identify which variables are associated with an outcome among a large class of potential predictors. This goal can be achieved by selecting variables in a manner that controls the…

Methodology · Statistics 2023-10-10 Yushu Shi, Michael Martens

Ultra-marginal Feature Importance: Learning from Data with Causal Guarantees

Scientists frequently prioritize learning from data rather than training the best possible model; however, research in machine learning often prioritizes the latter. Marginal contribution feature importance (MCI) was developed to break this…

Machine Learning · Statistics 2024-11-12 Joseph Janssen, Vincent Guan, Elina Robeva

Visualizing the Feature Importance for Black Box Models

In recent years, a large amount of model-agnostic methods to improve the transparency, trustability and interpretability of machine learning models have been developed. We introduce local feature importance as a local version of a recent…

Machine Learning · Statistics 2020-07-15 Giuseppe Casalicchio, Christoph Molnar, Bernd Bischl

Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests

This paper proposes a method for measuring conditional feature importance via generative modeling. In explainable artificial intelligence (XAI), conditional feature importance assesses the impact of a feature on a prediction model's…

Machine Learning · Statistics 2025-01-22 Kristin Blesch, Niklas Koenen, Jan Kapar, Pegah Golchian, Lukas Burk, Markus Loecher, Marvin N. Wright

Relative Feature Importance

Interpretable Machine Learning (IML) methods are used to gain insight into the relevance of a feature of interest for the performance of a model. Commonly used IML methods differ in whether they consider features of interest in isolation,…

Machine Learning · Statistics 2021-04-23 Gunnar König, Christoph Molnar, Bernd Bischl, Moritz Grosse-Wentrup

A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition

Selecting a minimal feature set that is maximally informative about a target variable is a central task in machine learning and statistics. Information theory provides a powerful framework for formulating feature selection algorithms --…

Information Theory · Computer Science 2023-05-05 Patricia Wollstadt, Sebastian Schmitt, Michael Wibral

Hierarchical Variable Importance with Statistical Control for Medical Data-Based Prediction

Recent advances in machine learning have greatly expanded the repertoire of predictive methods for medical imaging. However, the interpretability of complex models remains a challenge, which limits their utility in medical applications.…

Machine Learning · Statistics 2025-08-13 Joseph Paillard, Antoine Collas, Denis A. Engemann, Bertrand Thirion

Knockoffs for the mass: new feature importance statistics with false discovery guarantees

An important problem in machine learning and statistics is to identify features that causally affect the outcome. This is often impossible to do from purely observational data, and a natural relaxation is to identify features that are…

Machine Learning · Statistics 2019-05-30 Jaime Roquero Gimenez, Amirata Ghorbani, James Zou