Related papers: Simultaneous directional inference
While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (2011) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds…
In traditional hypothesis testing, one must pre-specify the significance level $\alpha$ to bound the 'size' of the test: its probability of falsely rejecting the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset…
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides…
In machine learning, the selection of a promising model from a potentially large number of competing models and the assessment of its generalization performance are critical tasks that need careful consideration. Typically, model selection…
In this paper, we propose a general method for testing composite hypotheses. Our idea is to use confidence limits to define stopping and decision rules. The requirements on the operating characteristic function can be satisfied by adjusting the…
In a high dimensional multiple testing framework, we present new confidence bounds on the false positives contained in subsets S of selected null hypotheses. The coverage probability holds simultaneously over all subsets S, which means that…
When testing many hypotheses, often we do not have strong expectations about the directions of the effects. In some situations, however, the alternative hypotheses are that the parameters lie in a certain direction or interval, and it is in…
The problem of simultaneously testing the marginal distributions of sequentially monitored, independent data streams is considered. The decisions for the various testing problems can be made at different times, using data from all streams,…
This paper focuses on parameter estimation and introduces a new method for lower bounding the Bayesian risk. The method allows for the use of virtually any information measure, including Rényi's $\alpha$, $\varphi$-Divergences, and…
Suppose one has a collection of parameters indexed by a (possibly infinite dimensional) set. Given data generated from some distribution, the objective is to estimate the maximal parameter in this collection evaluated at this distribution.…
Machine learning applications frequently come with multiple diverse objectives and constraints that can change over time. Accordingly, trained models can be tuned with sets of hyper-parameters that affect their predictive behavior (e.g.,…
Negative control is a common technique in scientific investigations and broadly refers to the situation where a null effect ("negative result") is expected. Motivated by a real proteomic dataset, we will present three promising and…
Tuning parameters are parameters involved in an estimating procedure for the purpose of reducing the risk of some other estimator. Examples include the degree of penalization in penalized regression and likelihood problems, as well as the…
Confidence intervals (CIs) are instrumental in statistical analysis, providing a range estimate of the parameters. In modern statistics, selective inference is common, where only certain parameters are highlighted. However, this selective…
Difference-in-differences (DiD) identification relies mainly on a parallel trends assumption about untreated potential outcomes. Researchers often relax this assumption by assuming conditional parallel trends within units with the same…
We derive inferential procedures for large sample sizes that remain valid under data-dependent significance levels (so-called "post-hoc valid inference"). Classical statistical tools require that the significance level -- the "type-I error"…
We review the methods of constructing confidence intervals that account for a priori information about one-sided constraints on the parameter being estimated. We show that the so-called method of sensitivity limit yields a correct solution…
In this paper, our interest is in the problem of simultaneous hypothesis testing when the test statistics corresponding to the individual hypotheses are possibly correlated. Specifically, we consider the case when the test statistics…
This paper describes three methods for carrying out non-asymptotic inference on partially identified parameters that are solutions to a class of optimization problems. Applications in which the optimization problems arise include estimation…