Related papers: Simultaneous directional inference
Inferring the causal direction between two variables from their observational data is one of the most fundamental and challenging topics in data science. A causal direction inference algorithm maps the observational data into a binary value…
An assumption often made in supervised learning is that the training and testing sets have the same label distribution. However, in real-life scenarios, this assumption rarely holds. For example, medical diagnosis result distributions…
We investigate the problem of jointly testing multiple hypotheses and estimating a random parameter of the underlying distribution in a sequential setup. The aim is to jointly infer the true hypothesis and the true parameter while using on…
We introduce $\textit{Backward Conformal Prediction}$, a method that guarantees conformal coverage while providing flexible control over the size of prediction sets. Unlike standard conformal prediction, which fixes the coverage level and…
A number of biomedical problems require performing many hypothesis tests, with an attendant need to apply stringent thresholds. Often the data take the form of a series of predictor vectors, each of which must be compared with a single…
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset…
We consider parameter estimation, hypothesis testing and variable selection for partially time-varying coefficient models. Our asymptotic theory has the useful feature that it can allow dependent, nonstationary error and covariate…
Score tests have the advantage of requiring estimation alone of the model restricted by the null hypothesis, which often is much simpler than models defined under the alternative hypothesis. This is typically so when the alternative…
In this paper, we develop a simple approach for testing multiple statistical hypotheses based on the observations of a number of probability ratios enumerated consecutively with respect to the index of hypotheses. Explicit and tight bounds…
Estimating how well a machine learning model performs during inference is critical in a variety of scenarios (for example, to quantify uncertainty, or to choose from a library of available models). However, the standard accuracy estimate of…
We consider the classical sequential binary hypothesis testing problem in which there are two hypotheses governed respectively by distributions $P_0$ and $P_1$ and we would like to decide which hypothesis is true using a sequential test. It…
A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…
Introduction: There is an ongoing debate about directional inference in two-sided hypothesis tests, in which some authors argue that rejecting $\theta = \theta_0$ does not allow one to conclude that $\theta > \theta_0$ or $\theta < \theta_0$…
Modern statisticians are often presented with hundreds or thousands of hypothesis testing problems to evaluate at the same time, generated from new scientific technologies such as microarrays, medical and satellite imaging devices, or flow…
Statistical inference of the high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to account for. A critical question remains unsettled; that is, is it possible…
How should researchers analyze randomized experiments in which the main outcome is latent and measured in multiple ways but each measure contains some degree of error? We first identify a critical study-specific noncomparability problem in…
Multi-arm bandit experimental designs are increasingly being adopted over standard randomized trials due to their potential to improve outcomes for study participants, enable faster identification of the best-performing options, and/or…
In prior work we have introduced an asymptotic threshold of sufficient randomness for causal inference from observational data. In this paper we extend that prior work in three main ways. First, we show how to empirically estimate a lower…
In typical high dimensional statistical inference problems, confidence intervals and hypothesis tests are performed for a low dimensional subset of model parameters under the assumption that the parameters of interest are unconstrained.…
We consider the problem of assessing the importance of multiple variables or factors from a dataset when side information is available. In principle, using side information can allow the statistician to pay attention to variables with a…