Statistical Methodology
Cancer subtyping plays a crucial role in informing prognosis and guiding personalized treatment strategies. However, conventional subtyping approaches often rely on static, biopsy-derived scores that hardly capture the biological…
For a fixed linear-model basis, we show that the $A$ criterion factors into an inverse-$D$ scale term and a dimensionless sphericity factor that depends only on eigenvalue dispersion. This factor isolates exactly the part of $A$ not…
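The factorization described in the abstract above can be illustrated numerically. The sketch below is one decomposition consistent with that description, not necessarily the paper's exact normalization: it assumes the A criterion is tr(M^{-1}) for an information matrix M, takes the scale term to be p · det(M)^{-1/p}, and takes the sphericity factor to be the ratio of the arithmetic to the geometric mean of the inverse eigenvalues. The function name `a_criterion_factors` and the simulated design matrix are hypothetical.

```python
import numpy as np

def a_criterion_factors(M):
    """Split tr(M^{-1}) into an inverse-D scale term and an AM/GM sphericity factor.

    Assumed definitions (not taken from the paper):
      scale      = p * det(M)^(-1/p)
      sphericity = arithmetic mean / geometric mean of the inverse eigenvalues
    """
    eig = np.linalg.eigvalsh(M)               # eigenvalues of the symmetric matrix M
    inv = 1.0 / eig                           # eigenvalues of M^{-1}
    geo_mean = np.exp(np.mean(np.log(inv)))   # equals det(M)^(-1/p)
    scale = eig.size * geo_mean
    sphericity = np.mean(inv) / geo_mean      # >= 1, equal to 1 iff all eigenvalues match
    return scale, sphericity

# Hypothetical design: 30 runs, 4 model terms
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
M = X.T @ X

scale, sphericity = a_criterion_factors(M)
a_value = np.trace(np.linalg.inv(M))          # A criterion computed directly
print(a_value, scale * sphericity, sphericity)
```

Under these assumed definitions the product of the two factors reproduces tr(M^{-1}), and rescaling M by a constant changes only the scale term, which is why the second factor is dimensionless.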
Conformal prediction gives exact finite-sample coverage guarantees under exchangeability, but deployed systems are judged by more than coverage alone. For a fixed calibrated rule reused over a finite operational window, stakeholders also…
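For readers unfamiliar with the coverage guarantee mentioned in the abstract above, here is a minimal sketch of split conformal prediction with absolute-residual scores. It is a generic textbook construction, not the method of that paper; the helper name `conformal_quantile`, the simulated calibration scores, and the point prediction `yhat_new` are all hypothetical.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Under exchangeability of calibration and test scores, a new score falls
    at or below this value with probability at least 1 - alpha.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the order statistic to use
    if k > n:
        return math.inf                    # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])

# Hypothetical calibration residuals |y - yhat| from some already-fitted model
rng = np.random.default_rng(1)
cal_scores = np.abs(rng.normal(size=199))

q = conformal_quantile(cal_scores, alpha=0.1)

# Symmetric prediction interval around a hypothetical point prediction
yhat_new = 2.0
interval = (yhat_new - q, yhat_new + q)
print(interval)
```

The rule is calibrated once and then reused, which matches the "fixed calibrated rule" setting the abstract refers to: the guarantee is marginal over the calibration draw, so realized coverage over any particular finite window can differ from the nominal level.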
Transfer learning (TL) has emerged as a powerful tool for improving estimation and prediction performance by leveraging information from related datasets, with the offset TL (O-TL) being a prevailing implementation. In this paper, we adapt…
Vaccine randomized trials are typically designed to be blinded, ensuring that the estimated vaccine efficacy (VE) reflects the immunological effect of the vaccine. When blinding is broken, however, the estimated VE reflects not only the…
Intercurrent events, such as treatment switching, rescue medication, dropout, or truncation by death, frequently complicate intention-to-treat analyses in randomized clinical trials. Existing causal inference frameworks typically target…
In plant breeding and variety testing, there is increasing interest in using environmental information to enhance predictions for new environments. Here, we review linear mixed models that have been proposed for this…
Causal inference relies on the untestable assumption of no unmeasured confounding. Sensitivity analysis can be used to quantify the impact of unmeasured confounding on causal estimates. Among sensitivity analysis methods proposed in the…
Background: Adaptive interventions provide a guide for using ongoing information about individuals to decide whether and how to modify the type, amount, delivery modality, or timing of treatment, to improve intervention effectiveness while…
Complex and larger networks are becoming increasingly prevalent in scientific applications in various domains. Although a number of models and methods exist for such networks, cross-validation on networks remains challenging due to the…
In a multi-fidelity setting, data are available from two sources, high- and low-fidelity. The low-fidelity data are more plentiful and can be leveraged to make more efficient inference about quantities of interest, e.g. the mean, for high-fidelity…
There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcomes are frequently observed, in which case the…
Case-I interval-censored (current status) data from multistate systems are often encountered in biomedical and epidemiological studies. In this article, we focus on the problem of estimating state entry distribution and occupation…
Adaptive enrichment trials aim to identify and recruit participants most likely to benefit from treatment based on evolving biomarker evidence, with the goal of informing individualized treatment recommendations. Bayesian methods are well…
Anomaly detection methods are widely used but often rely on ad hoc rules or strong assumptions, and they tend to focus on tail events, missing "inlier" anomalies that occur in low-density gaps between modes. We propose a unified framework…
Unnormalized (or energy-based) models provide a flexible framework for capturing the characteristics of data with complex dependency structures. However, the application of standard Bayesian inference methods has been severely limited…
Accurate on-orbit reliability prediction for satellite electronics is often hindered by limited data availability, varying operational conditions, and considerable unit-to-unit variability. To overcome these obstacles, this paper proposes a…
Statistical analysis of agricultural experiments is based on structured experimental designs such as randomized block, factorial, split-plot, and multi-environment trials. While the theoretical bases of these approaches are sound, their…
Extreme weather poses a large risk to critical energy systems (Ekisheva, Rieder, Norris, Lauby, & Dobson 2021; Levin, Botterud, Mann, Kwon, & Zhou 2022). Uncertainty quantification of negative impacts is important for developing resilience,…
We study estimation and inference for heterogeneous principal causal effects with binary treatments and binary intermediate variables. Principal causal effects are subgroup effects within strata defined by potential values of an…