Related papers: Anytime-Valid Inference for Multinomial Count Data
Motivated by monitoring the arrival of incoming adverse events such as customer support calls or crash reports from users exposed to an experimental product change, we consider sequential hypothesis testing of continuous-time inhomogeneous…
We investigate statistical inference across time scales. We take as toy model the estimation of the intensity of a discretely observed compound Poisson process with symmetric Bernoulli jumps. We have data at different time scales:…
In contemporary problems involving genetic or neuroimaging data, thousands of hypotheses need to be tested. Due to their high power, and finite sample guarantees on type-I error under weak assumptions, Monte Carlo permutation tests are…
We propose a novel continuous testing framework to test the intensities of Poisson Processes. This framework allows a rigorous definition of the complete testing procedure, from an infinite number of hypothesis to joint error rates. Our…
Linear models are foundational tools in statistics and ubiquitous across the applied sciences. However, conventional statistical inference -- such as $t$-tests and $F$-tests -- are only valid at fixed sample sizes, making them unsuitable…
A validated simulation model primarily requires performing an appropriate input analysis mainly by determining the behavior of real-world processes using probability distributions. In many practical cases, probability distributions of the…
Statistical hypothesis tests typically use prespecified sample sizes, yet data often arrive sequentially. Interim analyses invalidate classical error guarantees, while existing sequential methods require rigid testing preschedules or incur…
A validated simulation model primarily requires performing an appropriate input analysis mainly by determining the behavior of real-world processes using probability distributions. In many practical cases, probability distributions of the…
We present the first framework for Gaussian-process-modulated Poisson processes when the temporal data appear in the form of panel counts. Panel count data frequently arise when experimental subjects are observed only at discrete time…
App-based N-of-1 trials offer a scalable experimental design for assessing the effects of health interventions at an individual level. Their practical success depends on the strong motivation of participants, which, in turn, translates into…
Survival time is the primary endpoint of many randomized controlled trials, and a treatment effect is typically quantified by the hazard ratio under the assumption of proportional hazards. Awareness is increasing that in many settings this…
A/B tests are typically analyzed via frequentist p-values and confidence intervals; but these inferences are wholly unreliable if users endogenously choose samples sizes by *continuously monitoring* their tests. We define *always valid*…
We review approaches to statistical inference based on randomization. Permutation tests are treated as an important special case. Under a certain group invariance property, referred to as the ``randomization hypothesis,'' randomization…
Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single…
We study the problem of multiple hypothesis testing for multidimensional data when inter-correlations are present. The problem of multiple comparisons is common in many applications. When the data is multivariate and correlated, existing…
When variable selection methods are applied to bootstrapped and multiply imputed datasets, the set of selected variables typically varies across iterations. Aggregating results via the union rule can lead to overly dense models. We propose…
Many multiple testing procedures make use of the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently…
We study online change point detection for multivariate inhomogeneous Poisson point process time series. This setting arises commonly in applications such as earthquake seismology, climate monitoring, and epidemic surveillance, yet remains…
We consider the problem of inference on the signs of $n>1$ parameters. We aim to provide $1-\alpha$ post-hoc confidence bounds on the number of positive and negative (or non-positive) parameters. The guarantee is simultaneous, for all…
This paper addresses the following general scenario: A scientist wishes to perform a battery of experiments, each generating a sequential stream of data, to investigate some phenomenon. The scientist would like to control the overall error…