统计方法学
We extend generalized functional linear models under independence to a situation in which a functional covariate is related to a scalar response variable that exhibits spatial dependence-a complex yet prevalent phenomenon. For estimation,…
This manuscript bridges nonparametric smoothness-based and shape-restricted estimation, which may appear as two disjoint paradigms in the field. The proposed approach is motivated by a conceptually simple observation: every Lipschitz…
Modern experimental designs often face the so-called treatment cardinality constraint, which is the constraint on the number of included factors in each treatment. Experiments with such constraints are commonly encountered in engineering…
Summaries of craters on terrestrial bodies, such as the number and size distribution, are essential for understanding the history of the Solar System. Identifying craters, however, has not been automated and thus relies on expert…
A variational inference-based framework for training a multi-output Gaussian process latent variable model, specifically tailored to the tails-up spatio-temporal stream network, is developed. Training, given a censored observational data…
In high-throughput biology, it is common to fit thousands of linear regressions -- one per gene, protein, or other unit -- with very few samples per unit. Limma-trend, one of the most widely used methods in this setting, improves power by…
We introduce a continuous-time Markov chain framework for estimating population size from multi-list data, which allows directional interactions to be modelled and can accommodate absorbing lists, such as death records, or more general data…
Laplace approximations are a standard tool for computationally efficient inference in latent Gaussian models, but they fail for quantile regression with the asymmetric Laplace likelihood because the observed Hessian vanishes almost…
Missing data is pervasive in many scientific domains such as public health, environmental science, and the social sciences. Recoverability from missing data is typically studied using fully specified variable-level missingness models…
The intersection set of Bayesian and nonparametric statistics was almost empty until about 1973, but now is growing at a healthy rate. This chapter, for the {\it Highly Structured Stochastic Systems} book (Oxford University Press, 2003)…
This paper proposes a novel, nonparametric, interpoint distance-based measure to investigate whether there exist any groups in a set of given data, and if so then, how many groups are prevailing in total. It is a cluster accuracy index…
Modern applications of conformal inference to multiple testing problems, such as outlier detection and candidate selection, often involve selecting test samples whose conformal p-values fall below a threshold. The quality of such methods is…
Conditional average treatment effects (CATEs) are increasingly estimated from observational data and used to guide policy and individualized treatment decisions. Before such estimates can be trusted in practice, their predictive fitness…
During an infectious disease outbreak, providing accurate answers to policy questions about transmission requires a detailed model of the natural history of infectiousness. Unfortunately, direct measures of infectiousness are generally…
Distributed principal component analysis (PCA) produces node-level estimates of both a mean vector and a principal subspace. Robustly aggregating these heterogeneous objects requires a relative scale between mean error and subspace error.…
We develop joint confidence regions for linear regression coefficients when the regressors and errors are jointly stationary and ergodic with unspecified serial dependence. The method applies random smoothing, using an independent auxiliary…
Propensity score (PS) methods are widely used in observational studies to reduce confounding and estimate causal treatment effects. However, the validity of PS-based causal estimators depends heavily on correct model specification, and…
Changepoints are essential for homogenizing categorical time series and analyzing their trends and variations. The original total cloud cover in Canada was recorded hourly in tenths (or eighths), exhibiting inherent seasonality and serial…
Causal mediation analysis is essential for disentangling the mechanisms by which investigational therapeutic and preventive agents impact clinical outcomes. However, the measurement of biological mediators is often subject to left-censoring…
Many functional datasets are observed sparsely and irregularly. Ordering such data is challenging because only limited information is available from each observation, while the underlying trajectories remain infinite-dimensional. This paper…