Statistical Methodology
We consider the problem of learning how to optimally allocate treatments whose cost is uncertain and can vary with pre-treatment covariates. This setting may arise in medicine if we need to prioritize access to a scarce resource that…
Many countries measure poverty based only on income or consumption. However, there is growing awareness of the value of measuring poverty through multiple dimensions that capture a more reasonable picture of poverty status. Estimating poverty measure(s) for…
In recent years, there has been growing research interest in addressing treatment hierarchy questions within network meta-analysis (NMA). In NMAs involving many treatments, the number of possible hierarchy questions becomes prohibitively…
In many scientific domains, clustering aims to reveal interpretable latent structure that reflects relevant subpopulations or processes. Widely used Bayesian mixture models for model-based clustering often produce overlapping or redundant…
Many social science questions ask how linguistic properties causally affect an audience's attitudes and behaviors. Because text properties are often interlinked (e.g., angry reviews use profane language), we must control for possible latent…
This study provides essential insights into how diffusion processes unfold in complex networks, with a focus on cryptocurrency blockchains and infrastructure networks. The structural properties of these networks, such as hub-dominated,…
We propose a general CoVaR framework that extends the traditional CoVaR by incorporating diverse expert views and information, such as asset moment characteristics, quantile insights, and perspectives on the relative loss distribution…
A/B testing serves as the gold standard for large-scale, data-driven decision making in online businesses. To mitigate metric variability and enhance testing sensitivity, control variates and regression adjustment have emerged as prominent…
The study of associations and their causal explanations is a central research activity whose methodology varies tremendously across fields. Even within specialized subfields, comparisons across textbooks and journals reveal that the basics…
Background: Advanced adaptive randomised clinical trials are increasingly used. Compared to their conventional counterparts, their flexibility may make them more efficient, increase the probability of obtaining conclusive results without…
Small area estimation using survey data can be achieved by using either a design-based or a model-based inferential approach. Design-based direct estimators are generally preferable because of their consistency, asymptotic normality, and…
Stochastic dominance has not been widely employed in practice because of its significant limitations. To increase its versatility, the concept has recently been adapted by introducing various indices that measure the degree to which one probability…
Cluster-randomized trials (CRTs) are experimental designs where groups or clusters of participants, rather than the individual participants themselves, are randomized to intervention groups. Analyzing CRTs requires distinguishing between…
Micro-randomized trials (MRTs) are increasingly used to evaluate mobile health interventions with binary proximal outcomes. Standard inverse probability weighting (IPW) estimators are unbiased but unstable in small samples or under extreme…
Exposure to diverse non-genetic factors, known as the exposome, is a critical determinant of health outcomes. However, analyzing the exposome presents significant methodological challenges, including high collinearity among exposures, the…
By allowing the effects of $p$ covariates in a linear regression model to vary as functions of $R$ additional effect modifiers, varying-coefficient models (VCMs) strike a compelling balance between interpretable-but-rigid parametric models…
We propose a test for a change in the mean for a sequence of functional observations that are only partially observed on subsets of the domain, with no information available on the complement. The framework accommodates important scenarios,…
Clustered data arise naturally in many scientific and applied research settings where units are grouped within clusters. They are commonly analyzed using linear mixed models to account for within-cluster correlations. This article focuses…
Compositional data, representing proportions constrained to the simplex, arise in diverse fields such as geosciences, ecology, genomics, and microbiome research. Existing nonparametric density estimation methods often rely on…
Smart surveys are surveys that make use of sensors and machine intelligence to reduce respondent burden and increase data quality. Smart surveys have been tested as a way to improve diary surveys in official statistics, where data are…