Statistical Methodology
Changepoint detection typically relies on a grid-search strategy for optimal data segmentation. When model fitting itself is expensive, repeatedly fitting a model on every candidate segment dominates the computation. Existing approaches…
Principal stratification is a framework for making sense of causal effects conditioned on variables that may themselves have been affected by the treatment. For instance, in an evaluation of an educational intervention, some subjects in the…
Approving and assessing new drugs is complex because multiple criteria must be considered simultaneously. A common approach is benefit-risk analysis, often conducted within a Bayesian framework to account for uncertainty and combine data…
Classification is a common statistical task in many areas. New classification procedures are continually proposed to improve on the performance of existing methods. These procedures, especially those raised in the…
High-dimensional vector autoregressive (VAR) models provide a flexible framework for characterizing dynamic dependence in multivariate spatio-temporal systems, but their unrestricted estimation becomes infeasible when multiple variables are…
We study recursive maximum likelihood estimation for stochastic interacting particle systems based on continuous observation of a single particle. In this regime, consistent estimation of the finite-particle log-likelihood is not possible,…
We propose a computationally efficient inferential procedure for longitudinal function-on-function regression. The method follows a marginal three-step approach: (1) fit massive pointwise longitudinal scalar-on-function regression models,…
We propose a robust clustering framework for high-dimensional data with heavy tails and a large fraction of irrelevant variables. The method replaces the mean updates of Lloyd's $K$-means with \emph{spatial medians} to enhance robustness.…
Network interference occurs when a unit's outcome depends not only on its own treatment but also on the treatments received by connected units in the network. Experimental designs and analysis methods that ignore such interference can yield…
We develop a Fisher-consistent redescending robust estimator for the spatial scalar-on-function regression model, where a scalar response depends on both a functional predictor and a spatial autoregressive lag. Existing estimation…
Predictive Bayesian inference (PBI) represents a model- and prior-agnostic approach to standard Bayesian inference which allows users to quantify uncertainty for a functional of interest only by specifying a forward predictive model for…
Strong experimental papers in electrical and computer engineering and computer science (ECE/CS), especially in systems, networking, and applied machine learning, rest on more than a single impressive number. They rest on a chain of design,…
The hybrid approach to experimental design aims to control frequentist operating characteristics of Bayesian decision procedures. These operating characteristics are assessed by simulating sampling distributions of posterior summaries under…
We consider joint inversion for two or more unknown parameters from observational data in the Bayesian framework. Standard approaches often either treat the parameters as independent or impose structural similarity through regularisation…
Person-fit statistics are widely used to detect aberrant response patterns in educational and psychological measurement. Snijders (2001) suggested an asymptotically correct standardization for a broad class of such statistics. This paper…
We consider a generalization of the variance-gamma (generalized asymmetric Laplace) distribution, defined as a normal mean-variance mixture with a gamma mixing distribution. While this model is typically studied in the univariate setting,…
Cluster randomized trials are widely used when individual randomization is logistically infeasible or when correlations between observations cannot be ignored, especially in fields such as ophthalmology, infectious disease, vaccine…
Analysis often splits change into components. For example, how much of the observed variance is caused by genes or environment? In many cases, the split is ultimately made by the logic of the chain rule, which divides the difference of a…
Many research questions -- particularly those in environmental health -- do not involve binary exposures. In environmental epidemiology, this includes multivariate exposure mixtures with nondiscrete components. Causal inference estimands…
Typically, trials investigate the impact of either an individual-level intervention or a cluster-level intervention on participant outcomes. Factorial designs consider two (or more) treatments for each…