Related papers: Variable importance scores

Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations

Recommender systems are seen as an effective tool to address information overload, but it is widely known that the presence of various biases makes direct training on large-scale observational data result in sub-optimal prediction…

Information Retrieval · Computer Science 2023-04-19 Haoxuan Li, Yanghao Xiao, Chunyuan Zheng, Peng Wu

A principled approach for comparing Variable Importance

Variable importance measures (VIMs) aim to quantify the contribution of each input covariate to the predictability of a given output. With the growing interest in explainable AI, numerous VIMs have been proposed, many of which are heuristic…

Methodology · Statistics 2025-09-23 Angel Reyero-Lobo, Pierre Neuvial, Bertrand Thirion

Score Matching With Missing Data

Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy based modelling, and graphical model estimation. Despite all these applications, little work…

Machine Learning · Statistics 2025-06-03 Josh Givens, Song Liu, Henry W J Reeve

Challenges in Variable Importance Ranking Under Correlation

Variable importance plays a pivotal role in interpretable machine learning as it helps measure the impact of factors on the output of the prediction model. Model agnostic methods based on the generation of "null" features via permutation…

Machine Learning · Statistics 2024-02-07 Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski

A Simple and Effective Model-Based Variable Importance Measure

In the era of "big data", it is becoming more of a challenge to not only build state-of-the-art predictive models, but also gain an understanding of what's really going on in the data. For example, it is often of interest to know which, if…

Machine Learning · Statistics 2018-05-15 Brandon M. Greenwell, Bradley C. Boehmke, Andrew J. McCarthy

Fighting Noise with Noise: Causal Inference with Many Candidate Instruments

Instrumental variable methods provide useful tools for inferring causal effects in the presence of unmeasured confounding. To apply these methods with large-scale data sets, a major challenge is to find valid instruments from a possibly…

Methodology · Statistics 2024-09-24 Xinyi Zhang, Linbo Wang, Stanislav Volgushev, Dehan Kong

Comparing interpretability and explainability for feature selection

A common approach for feature selection is to examine the variable importance scores for a machine learning model, as a way to understand which features are the most relevant for making predictions. Given the significance of feature…

Machine Learning · Computer Science 2021-05-13 Jack Dunn, Luca Mingardi, Ying Daisy Zhuo

A general framework for inference on algorithm-agnostic variable importance

In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response -- in other words, to gauge the variable importance of features. Most recent work on…

Methodology · Statistics 2025-10-23 Brian D. Williamson, Peter B. Gilbert, Noah R. Simon, Marco Carone

Systematic Evaluation of Predictive Fairness

Mitigating bias in training on biased datasets is an important open problem. Several techniques have been proposed, however the typical evaluation regime is very limited, considering very narrow data conditions. For instance, the effect of…

Machine Learning · Computer Science 2022-10-18 Xudong Han, Aili Shen, Trevor Cohn, Timothy Baldwin, Lea Frermann

On the Necessity of Irrelevant Variables

This work explores the effects of relevant and irrelevant boolean variables on the accuracy of classifiers. The analysis uses the assumption that the variables are conditionally independent given the class, and focuses on a natural family…

Machine Learning · Computer Science 2012-06-12 David P. Helmbold, Philip M. Long

Reward-estimation variance elimination in sequential decision processes

Policy gradient methods are very attractive in reinforcement learning due to their model-free nature and convergence guarantees. These methods, however, suffer from high variance in gradient estimation, resulting in poor sample efficiency.…

Machine Learning · Computer Science 2018-11-16 Sergey Pankov

Score-Based Causal Discovery of Latent Variable Causal Models

Identifying latent variables and the causal structure involving them is essential across various scientific fields. While many existing works fall under the category of constraint-based methods (with e.g. conditional independence or rank…

Machine Learning · Computer Science 2026-05-21 Ignavier Ng, Xinshuai Dong, Haoyue Dai, Biwei Huang, Peter Spirtes, Kun Zhang

The Power of Unbiased Recursive Partitioning: A Unifying View of CTree, MOB, and GUIDE

A core step of every algorithm for learning regression trees is the selection of the best splitting variable from the available covariates and the corresponding split point. Early tree algorithms (e.g., AID, CART) employed greedy search…

Methodology · Statistics 2019-06-26 Lisa Schlosser, Torsten Hothorn, Achim Zeileis

Unbiased Measurement of Feature Importance in Tree-Based Methods

We propose a modification that corrects for split-improvement variable importance measures in Random Forests and other tree-based methods. These methods have been shown to be biased towards increasing the importance of features with more…

Machine Learning · Statistics 2020-03-25 Zhengze Zhou, Giles Hooker

Assessing variable importance in survival analysis using machine learning

Given a collection of features available for inclusion in a predictive model, it may be of interest to quantify the relative importance of a subset of features for the prediction task at hand. For example, in HIV vaccine trials, participant…

Methodology · Statistics 2025-03-27 Charles J. Wolock, Peter B. Gilbert, Noah Simon, Marco Carone

Variable importance without impossible data

The most popular methods for measuring importance of the variables in a black box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple subjects. These inputs can be unlikely, physically…

Machine Learning · Computer Science 2023-04-14 Masayoshi Mase, Art B. Owen, Benjamin B. Seiler

Unbiased Pairwise Learning from Implicit Feedback for Recommender Systems without Biased Variance Control

Generally speaking, the model training for recommender systems can be based on two types of data, namely explicit feedback and implicit feedback. Moreover, because of its general availability, we see wide adoption of implicit feedback data,…

Information Retrieval · Computer Science 2023-04-17 Yi Ren, Hongyan Tang, Jiangpeng Rong, Siwen Zhu

Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback

In most real-world recommender systems, the observed rating data are subject to selection bias, and the data are thus missing-not-at-random. Developing a method to facilitate the learning of a recommender with biased feedback is one of the…

Social and Information Networks · Computer Science 2022-06-16 Yuta Saito

Combating Unknown Bias with Effective Bias-Conflicting Scoring and Gradient Alignment

Models notoriously suffer from dataset biases which are detrimental to robustness and generalization. The identify-emphasize paradigm shows a promising effect in dealing with unknown biases. However, we find that it is still plagued by two…

Machine Learning · Computer Science 2022-11-29 Bowen Zhao, Chen Chen, Qian-Wei Wang, Anfeng He, Shu-Tao Xia

Feature Importance Disparities for Data Bias Investigations

It is widely held that one cause of downstream bias in classifiers is bias present in the training data. Rectifying such biases may involve context-dependent interventions such as training separate models on subgroups, removing features…

Machine Learning · Computer Science 2024-06-04 Peter W. Chang, Leor Fishman, Seth Neel