Statistical Methodology
This extended preface [to the book 'Bayesian Nonparametrics', Cambridge University Press, 2010, by NL Hjort, CC Holmes, P Mueller, SG Walker] is meant to explain why you are right to be curious about Bayesian nonparametrics -- why you may…
We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process…
Epilepsy is a neurological disorder characterized by recurrent seizures that affects more than 70 million people worldwide. Often, an individual with epilepsy is more likely to experience subsequent seizures following an initial seizure, a…
We develop a Hilbert--Schmidt independence criterion (HSIC)-based framework for testing serial independence in strictly stationary time series. The proposed auto Hilbert--Schmidt independence criterion (AutoHSIC) measures dependence between…
Conformal methods provide prediction sets for outcomes with confidence guarantees. We study their use in a selective inference setting, where inference is performed only when the prediction set is informative. The analyst may consider as…
This article introduces estimators of trend and seasonality for time series of point processes. We assume the point processes follow a temporal or spatial doubly-stochastic Poisson model with log-Gaussian intensity functions. The proposed…
Testing the equality of two high-dimensional mean vectors is a fundamental problem in multivariate analysis. While the classical Hotelling's $T^2$ test is optimal in low-dimensional settings, it fails when the dimension $p$ is comparable to…
Causal discovery from multivariate time series is challenging when causal effects may occur both across time and within the same sampling interval. This issue is especially important in applications such as neuroscience, where the sampling…
The test-negative design (TND) is a resource-efficient observational study design that can assess vaccine effectiveness and exposure-proximal immune correlates of disease. The TND enrolls symptomatic individuals seeking diagnostic testing…
We describe a Bayesian multidimensional explanatory IRT model, an associated Markov chain Monte Carlo (MCMC) estimation procedure, and the development of corresponding calibration software, designed for psychometric analyses of large…
Modern biomedical survival studies with high-dimensional genomic and clinical predictors are challenged by missing covariates. Existing methods conduct inference through penalization and debiasing when the number of covariates diverges with…
Recent research has led to the development of MCMC algorithms with likelihood-informed proposals when targeting posterior distributions supported on discrete state spaces. Our work is placed within this field and puts forward a new MCMC…
Generalized partially linear single-index additive models (GPLSIAMs) have been increasingly applied across diverse areas due to their versatility in integrating functional flexibility with parametric dimension reduction while maintaining…
We study distribution-free root cause analysis in multi-stream data, where an evolving underlying system is observed through multiple data streams that may each undergo distributional changes at unknown timepoints. In such settings, the…
Dennis Lindley once said that there is only one thing worse than a frequentist, and that is an empirical Bayesian. The quip has the air of caricature, but its technical content is serious: empirical Bayes uses the same data twice, conflates…
We introduce Pairwise Distance-Diffusion Analysis (PDDA), a geometric framework for estimating the Hurst exponent from distance plots of long-memory stochastic processes. A single construction yields two complementary routes: R/S-PDDA, a…
We develop an adaptive jump test for discretely observed high-frequency semimartingales by combining the Aït-Sahalia--Jacod ratio statistic (Aït-Sahalia and Jacod, 2009) and the Lee--Mykland extreme-return statistic (Lee and Mykland,…
Informative cluster size (ICS) and informative subgroup size (ISS) can distort marginal association estimates when the number of observed units, or their distribution across outcome-defined categories, is related to the outcomes under…
Multidimensional factor models with moderations on all model parameters have so far been limited to single-factor and two-factor models. This does not align well with existing psychological measures, which are commonly intended to assess…
Conformalized multiple testing offers a model-free way to control predictive uncertainty in decision-making. Existing methods typically use only part of the available data to build score functions tailored to specific settings. We propose a…