Related papers: Generalized Permutation Framework for Testing Mode…

One Permutation Is All You Need: Fast, Reliable Variable Importance and Model Stress-Testing

Reliable estimation of feature contributions in machine learning models is essential for trust, transparency and regulatory compliance, especially when models are proprietary or otherwise operate as black boxes. While permutation-based…

Machine Learning · Statistics 2025-12-24 Albert Dorador

Asymptotic Unbiasedness of the Permutation Importance Measure in Random Forest Models

Variable selection in sparse regression models is an important task as applications ranging from biomedical research to econometrics have shown. Especially for higher dimensional regression problems, for which the link function between…

Machine Learning · Statistics 2019-12-10 Burim Ramosaj, Markus Pauly

The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance

Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset.…

Machine Learning · Computer Science 2024-04-03 Jon Donnelly, Srikar Katta, Cynthia Rudin, Edward P. Browne

New methods for multiple testing in permutation inference for the general linear model

Permutation methods are commonly used to test significance of regressors of interest in general linear models (GLMs) for functional (image) data sets, in particular for neuroimaging applications as they rely on mild assumptions. Permutation…

Methodology · Statistics 2021-11-23 Tomas Mrkvicka, Mari Myllymaki, Mikko Kuronen, Naveen Naidu Narisetty

Statistically Valid Variable Importance Assessment through Conditional Permutations

Variable importance assessment has become a crucial step in machine-learning applications when using complex learners, such as deep neural networks, on large-scale data. Removal-based importance assessment is currently the reference…

Machine Learning · Computer Science 2023-10-27 Ahmad Chamma, Denis A. Engemann, Bertrand Thirion

Permutation testing in high-dimensional linear models: an empirical investigation

Permutation testing in linear models, where the number of nuisance coefficients is smaller than the sample size, is a well-studied topic. The common approach of such tests is to permute residuals after regressing on the nuisance covariates.…

Methodology · Statistics 2020-10-09 Jesse Hemerik, Magne Thoresen, Livio Finos

Permutation-based Hypothesis Testing for Neural Networks

Neural networks are powerful predictive models, but they provide little insight into the nature of relationships between predictors and outcomes. Although numerous methods have been proposed to quantify the relative contributions of input…

Methodology · Statistics 2023-01-30 Francesca Mandel, Ian Barnett

Detection of mean changes in partially observed functional data

We propose a test for a change in the mean for a sequence of functional observations that are only partially observed on subsets of the domain, with no information available on the complement. The framework accommodates important scenarios,…

Methodology · Statistics 2025-10-10 Šárka Hudecová, Claudia Kirch

The Exchangeability Assumption for Permutation Tests of Multiple Regression Models: Implications for Statistics and Data Science Educators

Permutation tests are a powerful and flexible approach to inference via resampling. As computational methods become more ubiquitous in the statistics curriculum, use of permutation tests has become more tractable. At the heart of the…

Methodology · Statistics 2025-06-09 Johanna Hardin, Lauren Quesada, Julie Ye, Nicholas J. Horton

Evaluating Fairness Using Permutation Tests

Machine learning models are central to people's lives and impact society in ways as fundamental as determining how people access information. The gravity of these models imparts a responsibility to model developers to ensure that they are…

Applications · Statistics 2020-07-13 Cyrus DiCiccio, Sriram Vasudevan, Kinjal Basu, Krishnaram Kenthapadi, Deepak Agarwal

Model-independent variable selection via the rule-based variable priority

While achieving high prediction accuracy is a fundamental goal in machine learning, an equally important task is finding a small number of features with high explanatory power. One popular selection technique is permutation importance,…

Machine Learning · Statistics 2024-10-02 Min Lu, Hemant Ishwaran

Permutation-based multi-objective evolutionary feature selection for high-dimensional data

Feature selection is a critical step in the analysis of high-dimensional data, where the number of features often vastly exceeds the number of samples. Effective feature selection not only improves model performance and interpretability but…

Machine Learning · Computer Science 2025-01-27 Raquel Espinosa, Gracia Sánchez, José Palma, Fernando Jiménez

Targeted Learning for Variable Importance

Variable importance is one of the most widely used measures for interpreting machine learning with significant interest from both statistics and machine learning communities. Recently, increasing attention has been directed toward…

Machine Learning · Statistics 2025-12-22 Xiaohan Wang, Yunzhe Zhou, Giles Hooker

Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models

Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is…

Machine Learning · Statistics 2020-02-11 Jiayun Dong, Cynthia Rudin

A Simple Approach for Local and Global Variable Importance in Nonlinear Regression Models

The ability to interpret machine learning models has become increasingly important as their usage in data science continues to rise. Most current interpretability methods are optimized to work on either (i) a global scale, where…

Methodology · Statistics 2023-08-11 Emily T. Winn-Nuñez, Maryclare Griffin, Lorin Crawford

A general framework for inference on algorithm-agnostic variable importance

In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response -- in other words, to gauge the variable importance of features. Most recent work on…

Methodology · Statistics 2025-10-23 Brian D. Williamson, Peter B. Gilbert, Noah R. Simon, Marco Carone

A Simple and Effective Model-Based Variable Importance Measure

In the era of "big data", it is becoming more of a challenge to not only build state-of-the-art predictive models, but also gain an understanding of what's really going on in the data. For example, it is often of interest to know which, if…

Machine Learning · Statistics 2018-05-15 Brandon M. Greenwell, Bradley C. Boehmke, Andrew J. McCarthy

Variable Importance in High-Dimensional Settings Requires Grouping

Explaining the decision process of machine learning algorithms is nowadays crucial for both model's performance enhancement and human comprehension. This can be achieved by assessing the variable importance of single variables, even for…

Machine Learning · Computer Science 2023-12-19 Ahmad Chamma, Bertrand Thirion, Denis A. Engemann

Permutation-based multiple testing when fitting many generalized linear models

In many applied sciences a popular analysis strategy for high-dimensional data is to fit many multivariate generalized linear models in parallel. This paper presents a novel approach to address the resulting multiple testing problem by…

Statistics Theory · Mathematics 2024-10-07 Riccardo De Santis, Jelle J. Goeman, Samuel Davenport, Jesse Hemerik, Livio Finos

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang, Jun S. Liu