Statistics Theory
For differences between means of continuous data from independent groups, the customary scale-free measure of effect is the standardized mean difference (SMD). To justify use of SMD, one should be reasonably confident that the group-level…
We study causal effect estimation under interference from network data. We work under the chain-graph formulation pioneered in Tchetgen Tchetgen et al. (2021). Our first result shows that polynomial-time evaluation of treatment effects is…
Theorem 5.1 in the monograph by Hall (1992) provides rigorous in-probability justification of Edgeworth expansions of bootstrap distributions. Proving this result was rather challenging because bootstrap distributions do not satisfy the…
In the Bayesian literature, a line of research called resolution of conflict is about the characterization of robustness against outliers of statistical models. The robustness characterization of a model is achieved by establishing the…
We consider the seriation problem, whose goal is to recover a hidden ordering from a noisy observation of a permuted Robinson matrix. We establish sharp minimax rates under average-Lipschitz conditions that strictly extend the bi-Lipschitz…
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $\mathrm{d}\mu_\phi \propto e^{-\phi}\,\mathrm{dvol}_g$ on a compact Riemannian manifold. Two…
Several statistical problems, such as multiple heterogeneous graph analysis, distributed PCA, integrative data analysis, and simultaneous dimension reduction of images, can involve a collection of $m$ matrices whose leading subspaces…
In this article, we propose a generic screening method for selecting explanatory variables correlated with the response variable Y. We make no assumptions about the existence of a model that could link Y with a subset of explanatory…
We consider the problem of estimating the probability density function of a circular random variable observed under censoring. To this end, we introduce a projection estimator constructed via a regression approach on linear sieves. We first…
We investigate parametric estimation of the elasticity parameter in the CKLS diffusion based on high-frequency data. First, we transform the CKLS diffusion to a CIR-type one via a smooth state-space mapping and the general Girsanov change…
The main objective of this paper is to propose a new approach for estimating the entire collection of Sobol' indices simultaneously. Our approach exploits the fact that Sobol' indices can be rewritten as solutions to an optimization problem…
Semi-implicit variational inference (SIVI) constructs approximate posteriors of the form $q(\theta) = \int k(\theta | z) r(dz)$, where the conditional kernel is parameterized and the mixing base is fixed and tractable. This paper develops a…
This work analyzes the subspace-constrained Tyler's estimator (STE), a method designed to recover a low-dimensional subspace from a dataset that may be heavily corrupted by outliers. The STE has previously been shown to be competitive for…
This paper introduces a novel extension of Fréchet means, called generalized Fréchet means, as a comprehensive framework for characterizing features in probability distributions in general topological spaces. The generalized…
This study explores the estimation of parameters in a matrix-valued linear regression model, where the $T$ responses $(Y_t)_{t=1}^T \in \mathbb{R}^{n \times p}$ and predictors $(X_t)_{t=1}^T \in \mathbb{R}^{m \times q}$ satisfy the…
In this article, we introduce the novel concept of the second maximum of a Gaussian random field on a Riemannian submanifold. This second maximum serves as a powerful tool for characterizing the distribution of the maximum. By utilizing an…
Covariate adjustment is a ubiquitous method used to estimate the average treatment effect (ATE) from observational data. Assuming a known graphical structure of the data generating model, recent results give graphical criteria for optimal…
Recent work in dynamic causal inference introduced a class of discrete-time stochastic processes that generalize martingale difference sequences and arrays as follows: the random variates in each sequence have expectation zero given certain…
High-order clustering aims to classify objects in multiway datasets that are prevalent in various fields such as bioinformatics, recommendation systems, and social network analysis. Such data are often sparse and high-dimensional, posing…
We propose methods to estimate the individual $\beta$-mixing coefficients of a real-valued geometrically ergodic Markov process from a single sample path $X_0, X_1, \dots, X_n$. Under standard smoothness conditions on the densities, namely,…