Related papers: Assessing Inference Methods
Design-based simulations - procedures that hold realized outcomes fixed and generate variation by resampling treatment assignment or shocks - are widely used in both methodological and applied work to assess inference procedures. This paper…
We consider the problem of inference in shift-share research designs. The choice between existing approaches that allow for unrestricted spatial correlation involves tradeoffs, varying in terms of their validity when there are relatively…
Comparative simulation studies are workhorse tools for benchmarking statistical methods. As with other empirical studies, the success of simulation studies hinges on the quality of their design, execution and reporting. If not conducted…
Simulations are valuable tools for empirically evaluating the properties of statistical methods and are primarily employed in methodological research to draw general conclusions about methods. In addition, they can often be useful to…
When interpreting A/B tests, we typically focus only on the statistically significant results and take them by face value. This practice, termed post-selection inference in the statistical literature, may negatively affect both point…
In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we…
Recent advancements in language models have led to significant improvements in mathematical reasoning across various benchmarks. However, most of these benchmarks rely on automatic evaluation methods that only compare final answers using…
In this paper, we investigate the problem of assessing statistical methods and effectively summarizing results from simulations. Specifically, we consider problems of the type where multiple methods are compared on a reasonably large test…
Matched case-control studies are commonly employed in epidemiological research for their convenience and efficiency. Analysis of secondary outcomes can yield valuable insights into biological pathways and help identify genetic variants of…
Recent research has generated hope that inference scaling, such as resampling solutions until they pass verifiers like unit tests, could allow weaker models to match stronger ones. Beyond inference, this approach also enables training…
Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral…
Simulation can enable the study of recommender system (RS) evolution while circumventing many of the issues of empirical longitudinal studies; simulations are comparatively easier to implement, are highly controlled, and pose no ethical…
We provide an approach to exploratory data analysis in matched observational studies with a single intervention and multiple endpoints. In such settings, the researcher would like to explore evidence for actual treatment effects among these…
Statistical inferential results generally come with a measure of reliability for decision-making purposes. For a policy implementer, the value of implementing published policy research depends critically upon this reliability. For a policy…
When teaching and discussing statistical assumptions, our focus is oftentimes placed on how to test and address potential violations rather than the effects of violating assumptions on the estimates produced by our statistical models. The…
Applied researchers in biomedicine and related fields are often interested in estimating the causal effect of a treatment or intervention. Although randomized clinical trials are considered the gold standard for establishing causal effects,…
Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…
Simulation-based inference has been popular for amortized Bayesian computation. It is typical to have more than one posterior approximation, from different inference algorithms, different architectures, or simply the randomness of…
Simulation methods are among the most ubiquitous methodological tools in statistical science. In particular, statisticians often is simulation to explore properties of statistical functionals in models for which developed statistical theory…
When deployed in the real world, machine learning models inevitably encounter changes in the data distribution, and certain -- but not all -- distribution shifts could result in significant performance degradation. In practice, it may make…