Statistical Methodology
Evaluating treatment effect heterogeneity across patient subgroups is a fundamental aspect of clinical trial analysis. Yet, these analyses have inherent limitations due to small sample sizes and the substantial number of subgroups…
High-dimensional biomedical studies require models that are simultaneously accurate, sparse, and interpretable, yet exact best subset selection for generalized linear models is computationally intractable. We develop a scalable method that…
This work develops a multivariate extension of the Fixed Rank Kriging (FRK) framework for spatial prediction in settings where multiple spatial processes may provide complementary information. The goal is to preserve the computational…
Ordinary differential equation (ODE) models are widely used to describe systems in many areas of science. To ensure these models provide accurate and interpretable representations of real-world dynamics, it is often necessary to infer…
Since the seminal Benjamini-Hochberg (BH) procedure for controlling the false discovery rate (FDR) was proposed, dozens of papers have attempted to improve its power by adapting to the unknown proportion of nulls. We observe that most null…
Current experimental design techniques for dynamical systems often only incorporate measurement noise, while dynamical systems also involve process noise. To construct experimental designs we need to quantify their information content. The…
The coordinate-exchange algorithm is commonly used to construct optimal experimental designs. Every execution of the coordinate-exchange algorithm produces a new, seemingly random, order of the selected design points. In this short…
We propose an information criterion for determining an unknown number of periodic components in functional time series. Identifying the number of frequencies in large-scale time series has been a central focus. To achieve this goal, we…
Structural and functional neuroimaging modalities provide complementary windows into brain organization: structural imaging characterizes neural tissue anatomy and microstructure, while functional imaging captures dynamic patterns of neural…
Brain encoding and decoding aims to understand the relationship between external stimuli and brain activities, and is a fundamental problem in neuroscience. In this article, we study latent embedding alignment for brain encoding and…
Nonstationary high-dimensional time series are increasingly encountered in biomedical research as measurement technologies advance. Owing to the homeostatic nature of physiological systems, such datasets are often located on, or can be well…
Evaluating treatment effects is critical in clinical trials but sometimes involves lengthy, invasive, or costly follow-up procedures. In these cases, surrogate markers, which provide intermediate measures of the long-term treatment effect,…
Large-scale longitudinal molecular profiling is now firmly established in biomedical research, prompted by the need to uncover coordinated biomarker trajectories reflecting the dynamics of underlying biological mechanisms and characterise…
We propose a nonparametric test of spatial independence for data observed on irregular, non-lattice point clouds $\mathcal{V}_{n}\subset\mathbb{R}^{2}$. For each location $v\in\mathcal{V}_{n}$, we encode the local spatial configuration…
Calibration weighting is a fundamental technique in survey sampling and data integration for incorporating auxiliary information and improving efficiency of estimators. Classical calibration methods are typically formulated through distance…
For two time series $\{ (Y_t, Z_t^Y) \}_{t}$ and $\{(X_t, Z_t^X)\}_{t}$, the directional dependence of $\{ X_t \}_{t}$ on $\{ Y_t \}_{t}$ while removing the impact of $Z_t^X$ on $X_t$ and the impact of $Z_t^Y$ on $ Y_t$ can be measured by…
Recent work has developed a non-parametric Bayesian approach to the calibration of a computer model, which abstractly amounts to the inversion of a pushforward of stochastic input parameters by a smooth map. The framework has been used in…
The governing equations of stochastic dynamical systems often become cost-prohibitive for numerical simulation at large scales. Surrogate models of the governing equations, learned from data of the high-fidelity system, are routinely used…
In pharmaceutical and toxicological research, historical control data are increasingly used to validate concurrent control groups, typically via the construction of historical control limits. While methods have been described for continuous…
This paper presents a generalization of the Weitzman overlapping coefficient, originally defined for two probability density functions, to a setting involving k independent distributions, denoted by Delta. To estimate this generalized…