Related papers: Selective inference with a randomized response

Splitting strategies for post-selection inference

We consider the problem of providing valid inference for a selected parameter in a sparse regression setting. It is well known that classical regression tools can be unreliable in this context due to the bias generated in the selection…

Methodology · Statistics 2022-12-07 Daniel G. Rasines, G. Alastair Young

Randomization Inference: Theory and Applications

We review approaches to statistical inference based on randomization. Permutation tests are treated as an important special case. Under a certain group invariance property, referred to as the "randomization hypothesis," randomization…

Econometrics · Economics 2025-02-05 David M. Ritzwoller, Joseph P. Romano, Azeem M. Shaikh

Methods of Selective Inference for Linear Mixed Models: a Review and Empirical Comparison

Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…

Methodology · Statistics 2025-03-14 Matteo D'Alessandro, Magne Thoresen

Selective Randomization Inference for Adaptive Experiments

Adaptive experiments use preliminary analyses of the data to inform further course of action and are commonly used in many disciplines including medical and social sciences. Because the null hypothesis and experimental design are…

Methodology · Statistics 2026-05-26 Tobias Freidling, Qingyuan Zhao, Zijun Gao

Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators

As predictive algorithms grow in popularity, using the same dataset to both train and test a new model has become routine across research, policy, and industry. Sample-splitting attains valid inference on model properties by using separate…

Econometrics · Economics 2025-11-27 Bruno Fava

Inference from Small and Big Data Sets with Error Rates

In this paper we introduce randomized $t$-type statistics that will be referred to as randomized pivots. We show that these randomized pivots yield central limit theorems with a significantly smaller magnitude of error as compared to that…

Methodology · Statistics 2014-04-24 Miklos Csorgo, Masoud M Nasari

Locally Simultaneous Inference

Selective inference is the problem of giving valid answers to statistical questions chosen in a data-driven manner. A standard solution to selective inference is simultaneous inference, which delivers valid answers to the set of all…

Methodology · Statistics 2024-05-03 Tijana Zrnic, William Fithian

Randomization Inference Tests for Shift-Share Designs

We consider the problem of inference in shift-share research designs. The choice between existing approaches that allow for unrestricted spatial correlation involves tradeoffs, varying in terms of their validity when there are relatively…

Econometrics · Economics 2022-06-03 Luis Alvarez, Bruno Ferman, Raoni Oliveira

Selective inference using randomized group lasso estimators for general models

Selective inference methods are developed for group lasso estimators for use with a wide class of distributions and loss functions. The method includes the use of exponential family distributions, as well as quasi-likelihood modeling for…

Methodology · Statistics 2024-03-28 Yiling Huang, Sarah Pirenne, Snigdha Panigrahi, Gerda Claeskens

Selective inference is easier with p-values

Selective inference is a subfield of statistics that enables valid inference after selection of a data-dependent question. In this paper, we introduce selectively dominant p-values, a class of p-values that allow practitioners to easily…

Methodology · Statistics 2024-11-22 Anav Sood

On Selecting and Conditioning in Multiple Testing and Selective Inference

We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven (sub)collection of hypotheses is chosen from some large universe of hypotheses.…

Methodology · Statistics 2024-04-09 Jelle Goeman, Aldo Solari

Enhancing Statistical Validity and Power in Hybrid Controlled Trials: A Randomization Inference Approach with Conformal Selective Borrowing

External controls from historical trials or observational data can augment randomized controlled trials when large-scale randomization is impractical or unethical, such as in drug evaluation for rare diseases. However, non-randomized…

Methodology · Statistics 2025-05-08 Ke Zhu, Shu Yang, Xiaofei Wang

General forms of finite population central limit theorems with applications to causal inference

Frequentists' inference often delivers point estimators associated with confidence intervals or sets for parameters of interest. Constructing the confidence intervals or sets requires understanding the sampling distributions of the point…

Statistics Theory · Mathematics 2016-10-18 Xinran Li, Peng Ding

Asymptotically-exact selective inference for quantile regression

In modern data analysis, it is common to select a model before performing statistical inference. Selective inference tools make adjustments for the model selection process in order to ensure reliable inference post selection. In this paper,…

Methodology · Statistics 2025-02-24 Yumeng Wang, Snigdha Panigrahi, Xuming He

Randomization Tests for Adaptively Collected Data

Randomization testing is a fundamental method in statistics, enabling inferential tasks such as testing for (conditional) independence of random variables, constructing confidence intervals in semiparametric location models, and…

Methodology · Statistics 2023-03-21 Yash Nair, Lucas Janson

Effect Inference from Two-Group Data with Sampling Bias

In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we…

Statistics Theory · Mathematics 2019-11-12 Dave Zachariah, Petre Stoica

A New Analysis of Differential Privacy's Generalization Guarantees

We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our…

Machine Learning · Computer Science 2024-06-05 Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, Moshe Shenfeld

Selective Inference with Distributed Data

As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure…

Methodology · Statistics 2023-03-14 Sifan Liu, Snigdha Panigrahi

Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values

Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may…

Methodology · Statistics 2024-09-05 F. Richard Guo, Rajen D. Shah

Categorize and randomize: a permissive model of stochastic choice

We model stochastic choices with categorization. The agent preliminarily groups alternatives in homogeneous disjoint classes, then randomly chooses one class and randomly picks an item within the selected class. We give a formal definition of…

Theoretical Economics · Economics 2026-01-06 Ester Sudano