Related papers: Exact Selective Inference with Randomization
Several strategies have been developed recently to ensure valid inference after model selection; some of these are easy to compute, while others fare better in terms of inferential power. In this paper, we consider a selective inference…
In finite population causal inference exact randomization tests can be constructed for sharp null hypotheses, i.e. hypotheses which fully impute the missing potential outcomes. Oftentimes inference is instead desired for the weak null that…
In modern data analysis, it is common to select a model before performing statistical inference. Selective inference tools make adjustments for the model selection process in order to ensure reliable inference post selection. In this paper,…
Inference after model selection has been an active research topic in the past few years, with numerous works offering different approaches to addressing the perils of the reuse of data. In particular, major progress has been made recently…
We consider a Bayesian approach to variable selection in the presence of high dimensional covariates based on a hierarchical model that places prior distributions on the regression coefficients as well as on the model space. We adopt the…
We study inference with a small labeled sample, a large unlabeled sample, and high-quality predictions from an external model. We link prediction-powered inference with empirical likelihood by stacking supervised estimating equations based…
Preferential sampling is a common feature in geostatistics and occurs when the locations to be sampled are chosen based on information about the phenomena under study. In this case, point pattern models are commonly used as the probability…
Modeled along the truncated approach in Panigrahi (2016), selection-adjusted inference in a Bayesian regime is based on a selective posterior. Such a posterior is determined together by a generative model imposed on data and the selection…
In this paper, we develop a general theory of truncated inverse binomial sampling. In this theory, the fixed-size sampling and inverse binomial sampling are accommodated as special cases. In particular, the classical Chernoff-Hoeffding…
We develop a method to generate prediction sets with a guaranteed coverage rate that is robust to corruptions in the training data, such as missing or noisy variables. Our approach builds on conformal prediction, a powerful framework to…
We consider inference on a scalar regression coefficient under a constraint on the magnitude of the control coefficients. A class of estimators based on a regularized propensity score regression is shown to exactly solve a tradeoff between…
In recent years, there has been a growing interest in statistical methods that exhibit robust performance under distribution changes between training and test data. While most of the related research focuses on point predictions with the…
We develop a Monte Carlo-free approach to inference post output from randomized algorithms with a convex loss and a convex penalty. The pivotal statistic based on a truncated law, called the selective pivot, usually lacks closed form…
Vanilla variational inference finds an optimal approximation to the Bayesian posterior distribution, but even the exact Bayesian posterior is often not meaningful under model misspecification. We propose predictive variational inference…
In this paper, we tackle the resolution of chance-constrained problems reformulated via Sample Average Approximation. The resulting data-driven deterministic reformulation takes the form of a large-scale mixed-integer program cursed with…
Adaptive experiments use preliminary analyses of the data to inform further course of action and are commonly used in many disciplines including medical and social sciences. Because the null hypothesis and experimental design are…
Bayesian models quantify uncertainty and facilitate optimal decision-making in downstream applications. For most models, however, practitioners are forced to use approximate inference techniques that lead to sub-optimal decisions due to…
Probabilistic programs are typically normal-looking programs describing posterior probability distributions. They intrinsically code up randomized algorithms and have long been at the heart of modern machine learning and approximate…
Linear mixed-effect models with two variance components are often used when variability comes from two sources. In genetics applications, variation in observed traits can be attributed to biological and environmental effects, and the…
Variable selection for Gaussian process models is often done using automatic relevance determination, which uses the inverse length-scale parameter of each input variable as a proxy for variable relevance. This implicitly determined…