Related papers: More Powerful Selective Kernel Tests for Feature S…
We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven (sub)collection of hypotheses is chosen from some large universe of hypotheses.…
We propose a method for feature selection that employs kernel-based measures of independence to find a subset of covariates that is maximally predictive of the response. Building on past work in kernel dimension reduction, we show how to…
It is common to show the confidence intervals or $p$-values of selected features, or predictor variables in regression, but they often involve selection bias. The selective inference approach solves this bias by conditioning on the…
Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso. This work addresses this challenge by introducing a generic approach to estimate…
Post-selection inference has recently been proposed as a way of quantifying uncertainty about detected changepoints. The idea is to run a changepoint detection algorithm, and then re-use the same data to perform a test for a change near…
Feature selection and reducing the dimensionality of data is an essential step in data analysis. In this work, we propose a new criterion for feature selection that is formulated as conditional information between features given the labeled…
Post-selection inference is a statistical technique for determining salient variables after model or variable selection. Recently, selective inference, a kind of post-selection inference framework, has garnered the attention in the…
We study prediction-powered conditional inference in the setting where labeled data are scarce, unlabeled covariates are abundant, and a black-box machine-learning predictor is available. The goal is to perform statistical inference on…
Selective regression allows abstention from prediction if the confidence to make an accurate prediction is not sufficient. In general, by allowing a reject option, one expects the performance of a regression model to increase at the cost of…
We consider testing whether a set of Gaussian variables, selected from the data, is independent of the remaining variables. We assume that this set is selected via a very simple approach that is commonly used across scientific disciplines:…
A simple and intuitive method for feature selection consists of choosing the feature subset that maximizes a nonparametric measure of dependence between the response and the features. A popular proposal from the literature uses the…
Statistical hypothesis testing serves as statistical evidence for scientific innovation. However, if the reported results are intentionally biased, hypothesis testing no longer controls the rate of false discovery. In particular, we study…
We give a finite-sample analysis of predictive inference procedures after model selection in regression with random design. The analysis is focused on a statistically challenging scenario where the number of potentially important…
We consider the problem of inference for parameters selected to report only after some algorithm, the canonical example being inference for model parameters after a model selection procedure. The conditional correction for selection…
Selective classification is a powerful tool for automated decision-making in high-risk scenarios, allowing classifiers to act only when confident and abstain when uncertainty is high. Given a target accuracy, our goal is to minimize…
The model-X conditional randomization test is a generic framework for conditional independence testing, unlocking new possibilities to discover features that are conditionally associated with a response of interest while controlling type-I…
Developing feature selection algorithms that move beyond a pure correlational to a more causal analysis of observational data is an important problem in the sciences. Several algorithms attempt to do so by discovering the Markov blanket of…
Subject selection plays a critical role in experimental studies, especially ones with human subjects. Anecdotal evidence suggests that many such studies, done at or near university campus settings suffer from selection bias, i.e., the…
Discovering statistically significant patterns from databases is an important challenging problem. The main obstacle of this problem is in the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that…
Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables. However,…