Related papers: Post-Hoc Large-Sample Statistical Inference
In traditional hypothesis testing one must pre-specify the significance level $\alpha$ to bound the `size' of the test: its probability to falsely reject the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…
A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation…
Post-hoc interpretability methods are critical tools to explain neural-network results. Several post-hoc methods have emerged in recent years, but when applied to a given task, they produce different results, raising the question of which…
Likelihood-based methods of statistical inference provide a useful general methodology that is appealing, as a straightforward asymptotic theory can be applied for their implementation. It is important to assess the relationships between…
Applying standard statistical methods after model selection may yield inefficient estimators and hypothesis tests that fail to achieve nominal type-I error rates. The main issue is the fact that the post-selection distribution of the data…
The literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily…
The validity of classical hypothesis testing requires the significance level $\alpha$ be fixed before any statistical analysis takes place. This is a stringent requirement. For instance, it prohibits updating $\alpha$ during (or after) an…
We study nonasymptotic (finite-sample) confidence intervals for treatment effects in randomized experiments. In the existing literature, the effective sample sizes of nonasymptotic confidence intervals tend to be looser than the…
Quality statistical inference requires a sufficient amount of data, which can be missing or hard to obtain. To this end, prediction-powered inference has risen as a promising methodology, but existing approaches are largely limited to…
In this paper, we study inference for high-dimensional data characterized by small sample sizes relative to the dimension of the data. In particular, we provide an infinite-dimensional framework to study statistical models that involve…
We develop methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the…
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides…
We consider parameter estimation, hypothesis testing and variable selection for partially time-varying coefficient models. Our asymptotic theory has the useful feature that it can allow dependent, nonstationary error and covariate…
A/B tests are typically analyzed via frequentist p-values and confidence intervals; but these inferences are wholly unreliable if users endogenously choose samples sizes by *continuously monitoring* their tests. We define *always valid*…
We examine the role of trustworthiness and trust in statistical inference, arguing that it is the extent of trustworthiness in inferential statistical tools which enables trust in the conclusions. Certain tools, such as the p-value and…
Simultaneous inference allows for the exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for true discoveries must employ closed testing. In this…
In modern data analysis, it is common to select a model before performing statistical inference. Selective inference tools make adjustments for the model selection process in order to ensure reliable inference post selection. In this paper,…
A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…
Uncertainty quantification is critical in safety-sensitive applications but is often omitted from off-the-shelf neural networks due to adverse effects on predictive performance. Retrofitting uncertainty estimates post-hoc typically requires…
Machine learning models are widely applied in various fields. Stakeholders often use post-hoc feature importance methods to better understand the input features' contribution to the models' predictions. The interpretation of the importance…