Statistical Methodology
High-dimensional data arise routinely in modern statistics, econometrics, finance, genomics, and machine learning. While a large body of existing methodology is developed under Gaussian or light-tailed assumptions, many real data sets…
This paper studies alpha testing in a high-dimensional conditional time-varying factor model with temporally dependent observations. Both factor loadings and alpha processes are allowed to vary smoothly over time, and the cross-sectional…
We study adaptive pooling under predictive heterogeneity in high-dimensional multivariate time series forecasting, where global models improve statistical efficiency but may fail to capture heterogeneous predictive structure, while naive…
Planning empirical experiments such as clinical trials or A/B tests requires sample size determination, which in many interesting cases has no closed-form solution (e.g. factorial or adaptive designs). adsasi is a new R package that enables…
This article introduces new methods for the analysis of cyclostationary time series with infinite variance. Traditional cyclostationary analysis, based on periodically correlated (PC) processes, relies on the autocovariance function (ACVF).…
Cluster-randomized trials (CRTs) are widely used to evaluate interventions delivered at the clinic, practice, or community level. Although standard analyses typically target average treatment effects, such summaries mask potentially…
Bayesian nonparametric mixture models provide a flexible framework for data analysis but are often hindered by the computational expense of traditional inference methods like MCMC. A fast, recursive algorithm proposed by Newton (2002)…
Refined vaccine regimens containing variant-matched inserts are often authorized based on historical phase 3 efficacy trials together with immunobridging studies. Phase 3 trials are essential for establishing immune biomarkers that reliably…
Evaluating the influence of continuous covariates, like exposure time or dose, on a response variable is a pivotal objective in the assessment of a compound's effect, particularly when determining toxicity in pre-clinical research or…
We propose a family of association measures for two-way contingency tables whose latent distribution can be assumed to be bivariate normal. When this assumption holds, the power-divergence measuring departure from independence can be…
Mediation analysis is a useful tool to evaluate surrogate endpoints in clinical trials. We propose a novel method, the M-survival learner, for estimating heterogeneous indirect treatment effects in the presence of censored outcomes. The…
Generalizing treatment effects from a randomized trial to a target population requires the assumption that potential outcome distributions are invariant across populations after conditioning on observed covariates. This assumption fails…
Cognitive diagnosis models (CDMs) are restricted latent class models widely used to measure attributes of interest in diagnostic assessments across education, psychology, biomedical sciences, and related fields. Partial-mastery CDMs…
Bayesian optimization (BO) is a powerful framework for estimating parameters of expensive simulation models, particularly in settings where the likelihood is intractable and evaluations are costly. In stochastic models every simulation is…
Many weak instrumental variables (IVs) are routinely used in the health and social sciences to improve identification and inference of the treatment effect of interest, along with a broad collection of data on potential confounding factors…
Partial correlation coefficients are widely applied in the social sciences to evaluate the relationship between two variables after accounting for the influence of others. In this article, we present Bayes Factor Functions (BFFs) for…
Treatment switching is common in the management of multiple sclerosis (MS), where patients transition across various disease-modifying therapies (DMTs) due to heterogeneous treatment responses, differences in disease…
In the context of a binary classification problem, the optimal linear combination of continuous predictors can be estimated by maximizing an empirical estimate of the area under the receiver operating characteristic (ROC) curve (AUC). For…
Results from multiple diagnostic tests are usually combined to improve the overall diagnostic accuracy. For binary classification, maximization of the empirical estimate of the area under the receiver operating characteristic (ROC) curve is…
Interference arises when the treatment assigned to one individual affects the outcomes of other individuals. Commonly, individuals are naturally grouped into clusters, and interference occurs only among individuals within the same cluster,…
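Two of the abstracts above refer to maximizing an empirical estimate of the AUC over linear combinations of continuous predictors. As a rough illustration of the quantity being maximized (not the estimation procedure of either paper, which the truncated abstracts do not describe), the sketch below computes the standard Mann-Whitney form of the empirical AUC for a given coefficient vector. The function names, the toy data, and the choice to count ties as one half are assumptions made for this example only.

```python
from typing import Sequence


def linear_score(x: Sequence[float], beta: Sequence[float]) -> float:
    """Linear combination beta'x of one subject's predictors."""
    return sum(b * xi for b, xi in zip(beta, x))


def empirical_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Mann-Whitney estimate of the AUC.

    Fraction of (positive, negative) pairs in which the positive subject
    has the higher score; ties contribute 1/2 (an assumed convention).
    """
    total = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                total += 1.0
            elif sp == sn:
                total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


def auc_of_combination(X_pos, X_neg, beta) -> float:
    """Empirical AUC of the linear score beta'x for the two groups."""
    pos = [linear_score(x, beta) for x in X_pos]
    neg = [linear_score(x, beta) for x in X_neg]
    return empirical_auc(pos, neg)


# Hypothetical toy data: two predictors, three subjects per group.
X_pos = [(2.0, 1.0), (1.0, 3.0), (3.0, 0.5)]
X_neg = [(1.0, 0.5), (0.5, 2.0), (2.5, 0.0)]

auc_first_only = auc_of_combination(X_pos, X_neg, (1.0, 0.0))
auc_equal_weights = auc_of_combination(X_pos, X_neg, (1.0, 1.0))
```

On this toy data the first predictor alone ranks 6.5 of the 9 pairs correctly, while equal weights separate the groups perfectly. The objective is a step function of the coefficients, which is one reason direct maximization is generally hard and why smoothed or alternative estimators are often considered.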