Related papers: Generalized Permutation Framework for Testing Mode…
Reliable estimation of feature contributions in machine learning models is essential for trust, transparency and regulatory compliance, especially when models are proprietary or otherwise operate as black boxes. While permutation-based…
Variable selection in sparse regression models is an important task as applications ranging from biomedical research to econometrics have shown. Especially for higher dimensional regression problems, for which the link function between…
Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset.…
Permutation methods are commonly used to test significance of regressors of interest in general linear models (GLMs) for functional (image) data sets, in particular for neuroimaging applications as they rely on mild assumptions. Permutation…
Variable importance assessment has become a crucial step in machine-learning applications when using complex learners, such as deep neural networks, on large-scale data. Removal-based importance assessment is currently the reference…
Permutation testing in linear models, where the number of nuisance coefficients is smaller than the sample size, is a well-studied topic. The common approach of such tests is to permute residuals after regressing on the nuisance covariates.…
Neural networks are powerful predictive models, but they provide little insight into the nature of relationships between predictors and outcomes. Although numerous methods have been proposed to quantify the relative contributions of input…
We propose a test for a change in the mean for a sequence of functional observations that are only partially observed on subsets of the domain, with no information available on the complement. The framework accommodates important scenarios,…
Permutation tests are a powerful and flexible approach to inference via resampling. As computational methods become more ubiquitous in the statistics curriculum, use of permutation tests has become more tractable. At the heart of the…
Machine learning models are central to people's lives and impact society in ways as fundamental as determining how people access information. The gravity of these models imparts a responsibility to model developers to ensure that they are…
While achieving high prediction accuracy is a fundamental goal in machine learning, an equally important task is finding a small number of features with high explanatory power. One popular selection technique is permutation importance,…
Feature selection is a critical step in the analysis of high-dimensional data, where the number of features often vastly exceeds the number of samples. Effective feature selection not only improves model performance and interpretability but…
Variable importance is one of the most widely used measures for interpreting machine learning with significant interest from both statistics and machine learning communities. Recently, increasing attention has been directed toward…
Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is…
The ability to interpret machine learning models has become increasingly important as their usage in data science continues to rise. Most current interpretability methods are optimized to work on either (\textit{i}) a global scale, where…
In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response -- in other words, to gauge the variable importance of features. Most recent work on…
In the era of "big data", it is becoming more of a challenge to not only build state-of-the-art predictive models, but also gain an understanding of what's really going on in the data. For example, it is often of interest to know which, if…
Explaining the decision process of machine learning algorithms is nowadays crucial for both model's performance enhancement and human comprehension. This can be achieved by assessing the variable importance of single variables, even for…
In many applied sciences a popular analysis strategy for high-dimensional data is to fit many multivariate generalized linear models in parallel. This paper presents a novel approach to address the resulting multiple testing problem by…
Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…