Statistical Methodology
Changepoint detection identifies times when the generative process of a time series changes, with applications in healthcare, cybersecurity, and finance. In multivariate settings, changes in cross-variable and temporal dependence are…
A growing number of scholars seek to estimate causal effects of unstructured data such as text, images, and video. However, existing methods typically treat each object as a single, static observation. We develop a statistical framework for…
We study an optimal threshold functional arising in binary classification for continuous biomarkers. While the ROC curve summarizes discriminatory performance across all thresholds, practical threshold selection must also account for…
Recently, a new testing approach for response-adaptive clinical trials was proposed based on the allocation probabilities (AP) rather than the outcome data. While original work on the AP test focused on binary and normal endpoints and…
Tensor regression is an important tool for tensor data analysis, but existing works have not considered the impact of outliers, making them potentially sensitive to such data points. This paper proposes a low tubal rank robust regression…
This paper presents a unified framework for sufficient dimension reduction (SDR) that generalizes several existing SDR techniques and offers new insights into the connection between inverse conditional moment independence and dimension…
Clinical prediction models must be developed using sufficiently large datasets to minimise overfitting and ensure robust predictive performance. Existing sample size calculations assume complete predictor data for all included participants,…
We propose a hidden Markov model for univariate proportion time series taking values in (0,1), where regime switching captures latent structural changes and the emission distribution belongs to the Beta family. In each latent state, the…
This study proposes a mixture cure model that latently divides a population based on event occurrence within a finite time horizon. Conventional models rely on event occurrence over an infinite horizon, introducing untestable assumptions…
We consider the problem of estimating the underlying edge probabilities of a time-varying network observed at multiple time points. The probability structure is represented by a time-varying graphon that satisfies temporal Hölder…
Estimating time-varying correlation matrices is challenging because existing methods may adapt slowly to structural changes, impose insufficient regularization, or produce diffuse posterior uncertainty. In moderate dimensions, an additional…
As learning systems increasingly shape everyday decisions, Algorithmic Collective Action (ACA), i.e., users coordinating changes to shared data to steer model behavior, offers a complement to regulator-side policy and corporate model…
Social contact matrices are essential tools in infectious disease epidemiology as they quantify close-range human contact patterns which directly drive the transmission of airborne infectious diseases. In this work we propose a Bayesian…
Variable-length Markov chains (VLMCs) are a flexible class of higher-order Markov models that admit a natural representation as context trees. Existing Bayesian methods for specifying prior distributions on tree structures rely on branching…
The survey experiment is widely used in economics and social sciences to evaluate the effects of treatments or programs. In a standard population-based survey experiment, the experimenter randomly draws experimental units from a target…
We present a method of constructing statistical intervals that obtain a natural middle ground between Bayesian and frequentist statistical intervals, previously unexplored in the literature: to a p% Bayesian credible interval we should assign a…
An important class of spatio-temporal models is constructed by leveraging the hierarchical structure of dynamical (or, state-space) models. This paper proposes a new statistical dynamical model for spatio-temporal processes motivated by…
Model uncertainty is a central challenge in statistical models for binary outcomes such as logistic regression, arising when it is unclear which predictors should be included in the model. Many methods have been proposed to address this…
Instrumental variables eliminate the bias that afflicts least-squares identification of dynamical systems through noisy data, yet traditionally rely on external instruments that are seldom available for nonlinear time series data. We…
We provide an inferential framework to assess variable importance for heterogeneous treatment effects. This assessment is especially useful in high-risk domains such as medicine, where decision makers hesitate to rely on black-box treatment…