Related papers: Post-Hoc Large-Sample Statistical Inference

Post-hoc $\alpha$ Hypothesis Testing and the Post-hoc $p$-value

In traditional hypothesis testing one must pre-specify the significance level $\alpha$ to bound the 'size' of the test: its probability to falsely reject the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…

Statistics Theory · Mathematics 2025-12-03 Nick W. Koning

Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha

A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation…

Methodology · Statistics 2024-04-04 Peter Grünwald

Evaluation of post-hoc interpretability methods in time-series classification

Post-hoc interpretability methods are critical tools to explain neural-network results. Several post-hoc methods have emerged in recent years, but when applied to a given task, they produce different results, raising the question of which…

Machine Learning · Computer Science 2024-12-09 Hugues Turbé, Mina Bjelogrlic, Christian Lovis, Gianmarco Mengaldo

Stability and uniqueness of $p$-values for likelihood-based inference

Likelihood-based methods of statistical inference provide a useful general methodology that is appealing, as a straightforward asymptotic theory can be applied for their implementation. It is important to assess the relationships between…

Statistics Theory · Mathematics 2015-03-20 Thomas J. DiCiccio, Todd A. Kuffner, G. Alastair Young, Russell Zaretzki

Tractable Post-Selection Maximum Likelihood Inference for the Lasso

Applying standard statistical methods after model selection may yield inefficient estimators and hypothesis tests that fail to achieve nominal type-I error rates. The main issue is the fact that the post-selection distribution of the data…

Methodology · Statistics 2019-05-23 Amit Meir, Mathias Drton

The 'Right' Extension of Type-I Error to Data-Dependent Levels

The literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily…

Statistics Theory · Mathematics 2026-05-28 Nick W. Koning

On admissibility in post-hoc hypothesis testing

The validity of classical hypothesis testing requires the significance level $\alpha$ be fixed before any statistical analysis takes place. This is a stringent requirement. For instance, it prohibits updating $\alpha$ during (or after) an…

Statistics Theory · Mathematics 2026-01-21 Ben Chugg, Tyron Lardy, Aaditya Ramdas, Peter Grünwald

On Nonasymptotic Confidence Intervals for Treatment Effects in Randomized Experiments

We study nonasymptotic (finite-sample) confidence intervals for treatment effects in randomized experiments. In the existing literature, the effective sample sizes of nonasymptotic confidence intervals tend to be looser than the…

Methodology · Statistics 2026-01-26 Ricardo J. Sandoval, Sivaraman Balakrishnan, Avi Feller, Michael I. Jordan, Ian Waudby-Smith

Prediction-Powered E-Values

Quality statistical inference requires a sufficient amount of data, which can be missing or hard to obtain. To this end, prediction-powered inference has risen as a promising methodology, but existing approaches are largely limited to…

Machine Learning · Statistics 2025-05-27 Daniel Csillag, Claudio José Struchiner, Guilherme Tegoni Goedert

Asymptotic inference for high-dimensional data

In this paper, we study inference for high-dimensional data characterized by small sample sizes relative to the dimension of the data. In particular, we provide an infinite-dimensional framework to study statistical models that involve…

Statistics Theory · Mathematics 2010-02-25 Jim Kuelbs, Anand N. Vidyashankar

Post-selection Inference in Regression Models for Group Testing Data

We develop methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the…

Methodology · Statistics 2025-04-17 Qinyan Shen, Karl Gregory, Xianzheng Huang

Valid post-selection inference

It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides…

Statistics Theory · Mathematics 2013-06-06 Richard Berk, Lawrence Brown, Andreas Buja, Kai Zhang, Linda Zhao

Inference of time-varying regression models

We consider parameter estimation, hypothesis testing and variable selection for partially time-varying coefficient models. Our asymptotic theory has the useful feature that it can allow dependent, nonstationary error and covariate…

Statistics Theory · Mathematics 2012-08-20 Ting Zhang, Wei Biao Wu

Always Valid Inference: Bringing Sequential Analysis to A/B Testing

A/B tests are typically analyzed via frequentist p-values and confidence intervals; but these inferences are wholly unreliable if users endogenously choose sample sizes by *continuously monitoring* their tests. We define *always valid*…

Statistics Theory · Mathematics 2019-07-18 Ramesh Johari, Leo Pekelis, David J. Walsh

Trustworthiness of statistical inference

We examine the role of trustworthiness and trust in statistical inference, arguing that it is the extent of trustworthiness in inferential statistical tools which enables trust in the conclusions. Certain tools, such as the p-value and…

Methodology · Statistics 2021-05-11 David J. Hand

Large-scale simultaneous inference under dependence

Simultaneous inference allows for the exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for true discoveries must employ closed testing. In this…

Statistics Theory · Mathematics 2022-03-24 Jinjin Tian, Xu Chen, Eugene Katsevich, Jelle Goeman, Aaditya Ramdas

Asymptotically-exact selective inference for quantile regression

In modern data analysis, it is common to select a model before performing statistical inference. Selective inference tools make adjustments for the model selection process in order to ensure reliable inference post selection. In this paper,…

Methodology · Statistics 2025-02-24 Yumeng Wang, Snigdha Panigrahi, Xuming He

Choosing alpha post hoc: the danger of multiple standard significance thresholds

A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…

Applications · Statistics 2025-03-11 Jesse Hemerik, Nick W Koning

Principled Input-Output-Conditioned Post-Hoc Uncertainty Estimation for Regression Networks

Uncertainty quantification is critical in safety-sensitive applications but is often omitted from off-the-shelf neural networks due to adverse effects on predictive performance. Retrofitting uncertainty estimates post-hoc typically requires…

Machine Learning · Computer Science 2025-06-03 Lennart Bramlage, Cristóbal Curio

Confident Feature Ranking

Machine learning models are widely applied in various fields. Stakeholders often use post-hoc feature importance methods to better understand the input features' contribution to the models' predictions. The interpretation of the importance…

Machine Learning · Statistics 2024-04-19 Bitya Neuhof, Yuval Benjamini