Related papers: More Powerful Selective Kernel Tests for Feature S…

On Selecting and Conditioning in Multiple Testing and Selective Inference

We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven (sub)collection of hypotheses is chosen from some large universe of hypotheses.…

Methodology · Statistics 2024-04-09 Jelle Goeman, Aldo Solari

Kernel Feature Selection via Conditional Covariance Minimization

We propose a method for feature selection that employs kernel-based measures of independence to find a subset of covariates that is maximally predictive of the response. Building on past work in kernel dimension reduction, we show how to…

Machine Learning · Statistics 2018-10-23 Jianbo Chen, Mitchell Stern, Martin J. Wainwright, Michael I. Jordan

Selective inference after feature selection via multiscale bootstrap

It is common to show the confidence intervals or $p$-values of selected features, or predictor variables in regression, but they often involve selection bias. The selective inference approach solves this bias by conditioning on the…

Methodology · Statistics 2022-06-02 Yoshikazu Terada, Hidetoshi Shimodaira

Black-box Selective Inference via Bootstrapping

Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso. This work addresses this challenge by introducing a generic approach to estimate…

Methodology · Statistics 2023-08-22 Sifan Liu, Jelena Markovic-Voronov, Jonathan Taylor

Improving Power by Conditioning on Less in Post-selection Inference for Changepoints

Post-selection inference has recently been proposed as a way of quantifying uncertainty about detected changepoints. The idea is to run a changepoint detection algorithm, and then re-use the same data to perform a test for a change near…

Methodology · Statistics 2026-05-11 Rachel Carrington, Paul Fearnhead

Feature Selection for multi-labeled variables via Dependency Maximization

Feature selection and reducing the dimensionality of data is an essential step in data analysis. In this work, we propose a new criterion for feature selection that is formulated as conditional information between features given the labeled…

Machine Learning · Statistics 2019-05-20 Salimeh Yasaei Sekeh, Alfred O. Hero

Selective Inference via Marginal Screening for High Dimensional Classification

Post-selection inference is a statistical technique for determining salient variables after model or variable selection. Recently, selective inference, a kind of post-selection inference framework, has garnered attention in the…

Methodology · Statistics 2019-06-28 Yuta Umezu, Ichiro Takeuchi

Prediction-Powered Conditional Inference

We study prediction-powered conditional inference in the setting where labeled data are scarce, unlabeled covariates are abundant, and a black-box machine-learning predictor is available. The goal is to perform statistical inference on…

Machine Learning · Statistics 2026-03-09 Yang Sui, Jin Zhou, Hua Zhou, Xiaowu Dai

Selective Regression Under Fairness Criteria

Selective regression allows abstention from prediction if the confidence to make an accurate prediction is not sufficient. In general, by allowing a reject option, one expects the performance of a regression model to increase at the cost of…

Machine Learning · Computer Science 2022-07-18 Abhin Shah, Yuheng Bu, Joshua Ka-Wing Lee, Subhro Das, Rameswar Panda, Prasanna Sattigeri, Gregory W. Wornell

Inferring independent sets of Gaussian variables after thresholding correlations

We consider testing whether a set of Gaussian variables, selected from the data, is independent of the remaining variables. We assume that this set is selected via a very simple approach that is commonly used across scientific disciplines:…

Methodology · Statistics 2022-11-04 Arkajyoti Saha, Daniela Witten, Jacob Bien

On the Limitation of Kernel Dependence Maximization for Feature Selection

A simple and intuitive method for feature selection consists of choosing the feature subset that maximizes a nonparametric measure of dependence between the response and the features. A popular proposal from the literature uses the…

Machine Learning · Statistics 2024-06-12 Keli Liu, Feng Ruan

A Simple Way to Deal with Cherry-picking

Statistical hypothesis testing serves as statistical evidence for scientific innovation. However, if the reported results are intentionally biased, hypothesis testing no longer controls the rate of false discovery. In particular, we study…

Methodology · Statistics 2018-10-12 Junpei Komiyama, Takanori Maehara

Conditional predictive inference post model selection

We give a finite-sample analysis of predictive inference procedures after model selection in regression with random design. The analysis is focused on a statistically challenging scenario where the number of potentially important…

Statistics Theory · Mathematics 2009-08-26 Hannes Leeb

Inference after black box selection

We consider the problem of inference for parameters selected to report only after some algorithm, the canonical example being inference for model parameters after a model selection procedure. The conditional correction for selection…

Methodology · Statistics 2019-01-30 Jelena Markovic, Jonathan Taylor, Jeremy Taylor

Ask for More Than Bayes Optimal: A Theory of Indecisions for Classification

Selective classification is a powerful tool for automated decision-making in high-risk scenarios, allowing classifiers to act only when confident and abstain when uncertainty is high. Given a target accuracy, our goal is to minimize…

Statistics Theory · Mathematics 2025-10-28 Mohamed Ndaoud, Peter Radchenko, Bradley Rava

Learning to Increase the Power of Conditional Randomization Tests

The model-X conditional randomization test is a generic framework for conditional independence testing, unlocking new possibilities to discover features that are conditionally associated with a response of interest while controlling type-I…

Machine Learning · Computer Science 2023-02-21 Shalev Shaer, Yaniv Romano

Markov Blanket Ranking using Kernel-based Conditional Dependence Measures

Developing feature selection algorithms that move beyond a pure correlational to a more causal analysis of observational data is an important problem in the sciences. Several algorithms attempt to do so by discovering the Markov blanket of…

Machine Learning · Statistics 2014-05-06 Eric V. Strobl, Shyam Visweswaran

Multi-characteristic Subject Selection from Biased Datasets

Subject selection plays a critical role in experimental studies, especially ones with human subjects. Anecdotal evidence suggests that many such studies, done at or near university campus settings, suffer from selection bias, i.e., the…

Machine Learning · Computer Science 2020-12-21 Tahereh Arabghalizi, Alexandros Labrinidis

Selective Inference Approach for Statistically Sound Predictive Pattern Mining

Discovering statistically significant patterns from databases is an important challenging problem. The main obstacle of this problem is in the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that…

Machine Learning · Statistics 2016-03-10 Shinya Suzumura, Kazuya Nakagawa, Mahito Sugiyama, Koji Tsuda, Ichiro Takeuchi

Feature Selection via Mutual Information: New Theoretical Insights

Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables. However,…

Machine Learning · Computer Science 2019-07-18 Mario Beraha, Alberto Maria Metelli, Matteo Papini, Andrea Tirinzoni, Marcello Restelli