Related papers: Understanding complex predictive models with Ghost…

The Importance of Variable Importance

Variable importance is defined as a measure of each regressor's contribution to model fit. Using $R^2$ as the fit criterion in linear models leads to the Shapley value (LMG) and proportionate value (PMVD) as variable importance measures.…

Methodology · Statistics 2022-12-08 Charles D. Coleman

The BP Dependency Function: a Generic Measure of Dependence between Random Variables

Measuring and quantifying dependencies between random variables (RV's) can give critical insights into a data-set. Typical questions are: 'Do underlying relationships exist?', 'Are some variables redundant?', and 'Is some target variable…

Machine Learning · Statistics 2022-03-24 Guus Berkelmans, Joris Pries, Sandjai Bhulai, Rob van der Mei

Understanding Global Feature Contributions With Additive Importance Measures

Understanding the inner workings of complex machine learning models is a long-standing problem and most recent research has focused on local interpretability. To assess the role of individual input features in a global sense, we explore the…

Machine Learning · Computer Science 2020-10-28 Ian Covert, Scott Lundberg, Su-In Lee

Evaluation of Similarity-based Explanations

Explaining the predictions made by complex machine learning models helps users to understand and accept the predicted outputs with confidence. One promising way is to use similarity-based explanation that provides similar instances as…

Machine Learning · Computer Science 2021-03-24 Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui

Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models

Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is…

Machine Learning · Statistics 2020-02-11 Jiayun Dong, Cynthia Rudin

On the implied weights of linear regression for causal inference

A basic principle in the design of observational studies is to approximate the randomized experiment that would have been conducted under controlled circumstances. Now, linear regression models are commonly used to analyze observational…

Methodology · Statistics 2022-07-08 Ambarish Chattopadhyay, Jose R. Zubizarreta

A Simple and Effective Model-Based Variable Importance Measure

In the era of "big data", it is becoming more of a challenge to not only build state-of-the-art predictive models, but also gain an understanding of what's really going on in the data. For example, it is often of interest to know which, if…

Machine Learning · Statistics 2018-05-15 Brandon M. Greenwell, Bradley C. Boehmke, Andrew J. McCarthy

Individualized Conformal

The problem of individualized prediction can be addressed using variants of conformal prediction, obtaining the intervals to which the actual values of the variables of interest belong. Here we present a method based on detecting the…

Methodology · Statistics 2023-04-12 Fernando Delbianco, Fernando Tohmé

Distance Metrics for Measuring Joint Dependence with Application to Causal Inference

Many statistical applications require the quantification of joint dependence among more than two random vectors. In this work, we generalize the notion of distance covariance to quantify joint dependence among $d \geq 2$ random vectors. We…

Methodology · Statistics 2018-06-18 Shubhadeep Chakraborty, Xianyang Zhang

Evaluating Explainability in Machine Learning Predictions through Explainer-Agnostic Metrics

The rapid integration of artificial intelligence (AI) into various industries has introduced new challenges in governance and regulation, particularly regarding the understanding of complex AI systems. A critical demand from decision-makers…

Machine Learning · Computer Science 2024-11-08 Cristian Munoz, Kleyton da Costa, Bernardo Modenesi, Adriano Koshiyama

All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously

Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model $f(\mathbf{x})=\mathbf{x}^{T}\beta$ with a…

Methodology · Statistics 2019-12-24 Aaron Fisher, Cynthia Rudin, Francesca Dominici

Randomized Ablation Feature Importance

Given a model $f$ that predicts a target $y$ from a vector of input features $\pmb{x} = x_1, x_2, \ldots, x_M$, we seek to measure the importance of each feature with respect to the model's ability to make a good prediction. To this end, we…

Machine Learning · Computer Science 2019-10-03 Luke Merrick

Interpreting random forest classification models using a feature contribution method

Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is relatively easy for statistical models, such as linear regressions, thanks to the…

Machine Learning · Computer Science 2013-12-05 Anna Palczewska, Jan Palczewski, Richard Marchese Robinson, Daniel Neagu

Predictive learning via rule ensembles

General regression and classification models are constructed as linear combinations of simple rules derived from the data. Each rule consists of a conjunction of a small number of simple statements concerning the values of individual input…

Applications · Statistics 2008-11-12 Jerome H. Friedman, Bogdan E. Popescu

Joint Concordance Index

Existing metrics in competing risks survival analysis such as concordance and accuracy do not evaluate a model's ability to jointly predict the event type and the event time. To address these limitations, we propose a new metric, which we…

Methodology · Statistics 2019-08-20 Kartik Ahuja, Mihaela van der Schaar

Variable selection for Gaussian processes via sensitivity analysis of the posterior predictive distribution

Variable selection for Gaussian process models is often done using automatic relevance determination, which uses the inverse length-scale parameter of each input variable as a proxy for variable relevance. This implicitly determined…

Methodology · Statistics 2019-04-24 Topi Paananen, Juho Piironen, Michael Riis Andersen, Aki Vehtari

Testing for the Important Components of Posterior Predictive Variance

We give a decomposition of the posterior predictive variance using the law of total variance and conditioning on a finite dimensional discrete random variable. This random variable summarizes various features of modeling that are used to…

Methodology · Statistics 2022-09-02 Dean Dustin, Bertrand Clarke

Lazy Estimation of Variable Importance for Large Neural Networks

As opaque predictive models increasingly impact many areas of modern life, interest in quantifying the importance of a given input variable for making a specific prediction has grown. Recently, there has been a proliferation of…

Machine Learning · Statistics 2022-07-20 Yue Gao, Abby Stevens, Rebecca Willet, Garvesh Raskutti

An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version

A fundamental task in statistical learning is quantifying the joint dependence or association between two continuous random variables. We introduce a novel, fully non-parametric measure that assesses the degree of association between…

Machine Learning · Computer Science 2025-01-22 Renato Assunção, Flávio Figueiredo, Francisco N. Tinoco Júnior, Léo M. de Sá-Freire, Fábio Silva

Joint density of eigenvalues in spiked multivariate models

The classical methods of multivariate analysis are based on the eigenvalues of one or two sample covariance matrices. In many applications of these methods, for example to high dimensional data, it is natural to consider alternative…

Statistics Theory · Mathematics 2014-06-17 Prathapasinghe Dharmawansa, Iain M. Johnstone