Statistical Methodology
We present the results of a large number of simulation studies on the power of various goodness-of-fit tests and non-parametric two-sample tests for multivariate data. In two dimensions this includes both continuous and discrete…
High-dimensional health and surveillance studies often involve many collinear predictors, multiple correlated outcomes of different types, and latent heterogeneity across observational units. We propose a Bayesian latent-cluster…
Attributing an observed outcome to its root cause is a central task in domains ranging from medical diagnosis to engineering fault diagnosis. Existing approaches either equate the root cause with a root node of the causal graph, as in…
Studies of HPV vaccine efficacy usually record infections with vaccine-targeted and nontargeted strains. Unlike in blinded randomized controlled trials, confounding bias can be a threat and risk compensation may occur in observational…
Traditional analysis of marked spatial point processes often relies on global summary statistics, which tend to obscure local spatial heterogeneity by averaging dependencies across the entire observation window. To overcome this limitation,…
Plant breeding programs use data obtained from multi-environment selection experiments to produce improved varieties with the ultimate aim of maintaining high levels of genetic gain. Selection accuracy can be improved with the use of…
Causal graphs may inform covariate adjustment for estimating causal effects and improve estimation efficiency by exploiting the graphical structure. In many applications, however, the target causal parameter may not be point-identified due…
In causal inference with ordinal outcomes, several interpretable estimands are functions of the probability that the potential outcome under one treatment is larger than that under another treatment for the same unit. This probability…
Many real-world networks exhibit hierarchical, tree-like structure and heavy-tailed degree distributions, phenomena not readily captured by standard statistical models for network data. Extensions of the popular continuous latent space…
We develop a unified operator framework for scalar, multivariate, and functional regression based on integral operators defined with respect to general measures. Within this framework, classical regression models, including…
Background: External validation is essential for assessing the transportability of predictive models. However, its interpretation is often confounded by differences between external and development populations. This study introduces a…
This paper challenges the prevailing practice of accepting standardized factor loadings as low as .50 in confirmatory factor analysis. Drawing on the logic of Average Variance Extracted (AVE) and communality, the author argues for a…
We propose a new constrained EM algorithm that is applicable to general constrained estimation problems. The proposed method is based on a novel framework, the "dual-homotopy framework," which combines deterministic annealing EM with a…
We introduce the Multiplicative Quasi-Instrumental Variable (MQIV) model, a framework for causal inference with unmeasured confounding that leverages an instrument that may be imperfectly exogenous. We allow the candidate quasi-instrument…
Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterization of conditional coverage through marginal…
Network Meta-Analysis (NMA) is an increasingly popular evidence synthesis tool that can provide a ranking of competing treatments, also known as a treatment hierarchy. Treatment-Covariate Interactions (TCIs) can be included in NMA models to…
Quasi-experimental causal inference methods have become central in empirical operations management for guiding managerial decisions. Among these, empiricists utilize the Difference-in-Differences (DiD) estimator, which relies on the…
Parameter estimation and inference from complex survey samples typically focus on global model parameters whose estimators have asymptotic properties, such as those from fixed-effects regression models. The central challenge is to both mitigate…
Large-scale assessment data typically include numerous categorical variables, often affected by missing values. Motivated by the challenges arising in this framework, we extend the knockoffs method for selecting predictors to settings with…
The instrumental variable (IV) design is a common approach to address hidden confounding bias. For validity, an IV must impact the outcome only through its association with the treatment. In addition, IV identification has required a…