Related papers: Conditional predictive inference post model select…
In clinical trials, inferences on clinical outcomes are often made conditional on specific selective processes. For instance, only when a treatment demonstrates a significant effect on the primary outcome, further analysis is conducted to…
This paper explores the challenges of constructing suitable inferential models in scenarios where the parameter of interest is determined in light of the data, such as regression after variable selection. Two compelling arguments for…
Statistical inference after model selection requires an inference framework that takes the selection into account in order to be valid. Following recent work on selective inference, we derive analytical expressions for inference after…
In this article, we review selective inference, a set of techniques for inference when the statistical question asked is a function of the data. This setting often arises in contemporary scientific workflows, where hypotheses and parameters…
Variable selection for regression models plays a key role in the analysis of biomedical data. However, inference after selection is not covered by classical statistical frequentist theory which assumes a fixed set of covariates in the…
The inferential model (IM) framework provides valid prior-free probabilistic inference by focusing on predicting unobserved auxiliary variables. But, efficient IM-based inference can be challenging when the auxiliary variable is of higher…
We investigate generically applicable and intuitively appealing prediction intervals based on $k$-fold cross validation. We focus on the conditional coverage probability of the proposed intervals, given the observations in the training…
Conditional probabilities are a core concept in machine learning. For example, optimal prediction of a label $Y$ given an input $X$ corresponds to maximizing the conditional probability of $Y$ given $X$. A common approach to inference tasks…
We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the…
Investigators often use the data to generate interesting hypotheses and then perform inference for the generated hypotheses. P-values and confidence intervals must account for this explorative data analysis. A fruitful method for doing so…
This work addresses the problem of conducting valid inference for additive and linear mixed models after model selection. One possible solution to overcome overconfident inference results after model selection is selective inference, which…
We consider an empirical likelihood framework for inference for a statistical model based on an informative sampling design. Covariate information is incorporated both through the weights and the estimating equations. The estimator is based…
Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…
We consider the problem of inference for parameters selected to report only after some algorithm, the canonical example being inference for model parameters after a model selection procedure. The conditional correction for selection…
Conformal predictors are machine learning algorithms that output prediction sets that have a guarantee of marginal validity for finite samples with minimal distributional assumptions. This is a property that makes conformal predictors…
Parameters of sub-populations can be more relevant than super-population ones. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned…
Valid inference after model selection is currently a very active area of research. The polyhedral method, pioneered by Lee, et al. (2016), allows for valid inference after model selection if the model selection event can be described by…
Intelligent test requires efficient and effective analysis of high-dimensional data in a large scale. Traditionally, the analysis is often conducted by human experts, but it is not scalable in the era of big data. To tackle this challenge,…
We consider the problem of estimating the conditional distribution of a post-model-selection estimator where the conditioning is on the selected model. The notion of a post-model-selection estimator here refers to the combined procedure…
We examine the conditions under which descriptive inference can be based directly on the observed distribution in a non-probability sample, under both the super-population and quasi-randomisation modelling approaches. Review of existing…