Statistics Theory
By now Bayesian methods are routinely used in practice for solving inverse problems. In inverse problems the parameter or signal of interest is observed only indirectly, as an image of a given map, and the observations are typically further…
It is well known that approximate factor models suffer from rotation indeterminacy. The principal component (PC) estimators have been understood to estimate some rotations of the true factors and factor loadings, but the rotation…
This paper studies how to uncover a hidden population distribution using a variational non-Bayesian approach. It asserts that if the hidden probability density function (PDF) has continuous partial…
We show both adaptive and non-adaptive minimax rates of convergence for a family of weighted Laplacian-Eigenmap based nonparametric regression methods, when the true regression function belongs to a Sobolev space and the sampling density is…
Two models are introduced to investigate graph matching in the presence of corrupt nodes. The weak model, inspired by biological networks, allows one or both networks to have a positive fraction of molecular entities interact randomly with…
I propose Ziv-Zakai-type lower bounds on the Bayesian error for estimating a parameter $\beta:\Theta \to \mathbb R$ when the parameter space $\Theta$ is general and $\beta(\theta)$ need not be a linear function of $\theta$.
We study pointwise estimation and uncertainty quantification for a sparse variational Gaussian process method with eigenvector inducing variables. For a rescaled Brownian motion prior, we derive theoretical guarantees and limitations for…
We study high-dimensional least-squares regression within a subgaussian statistical learning framework with heterogeneous noise. It includes $s$-sparse and $r$-low-rank least-squares regression when a fraction $\epsilon$ of the labels are…
We consider the classical Shiryaev--Roberts martingale diffusion, $(R_t)_{t\ge0}$, restricted to the interval $[0,A]$, where $A>0$ is a preset absorbing boundary. We take yet another look at the well-known phenomenon of quasi-stationarity…
Combining test statistics from independent trials or experiments is a popular method of meta-analysis. However, there is very limited theoretical understanding of the power of the combined test, especially in high-dimensional models…
These lecture notes were written for the course 18.657, High Dimensional Statistics at MIT. They build on a set of notes prepared at Princeton University in 2013-14 and modified (and hopefully improved) over the years.
In this paper, a novel test of whether data are Missing Completely at Random is proposed. Asymptotic properties of the test are derived utilizing the theory of non-degenerate U-statistics. It is shown that the novel test statistic…
We present an alternating direction method of multipliers (ADMM) for a generic overlapping group lasso problem, where the groups may overlap in an arbitrary way. We also prove lower and upper bounds for both the…
The paper addresses asymptotic estimation of normal means under sparsity. The primary focus is estimation of multivariate normal means, where we obtain the exact asymptotic minimax error under a global-local shrinkage prior. This extends the…
Augmented block designs for unreplicated test treatments are investigated under the A- and MV-criteria with respect to control versus control, test versus test and control versus test comparisons. We derive design-independent lower bounds…
The original Hotelling-Solomons inequality states that |mean - median|/(standard deviation) is bounded above by 1. In this note, we find a new bound depending on the sample size, which is strictly smaller than 1.
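As a quick numerical illustration of the classical bound in the entry above, the following Python sketch computes |mean - median|/(standard deviation) for simulated samples and checks that it never exceeds 1. The helper name `hs_ratio`, the exponential test data, and the use of the population (divide-by-n) standard deviation are assumptions made for illustration and are not taken from the note. The sample-size-dependent bound is not reproduced here.

```python
import random
import statistics


def hs_ratio(sample):
    """Return |mean - median| / sd, using the population (divide-by-n) sd.

    Hypothetical helper for illustration; returns 0.0 for a constant sample.
    """
    sd = statistics.pstdev(sample)
    if sd == 0:
        return 0.0
    return abs(statistics.fmean(sample) - statistics.median(sample)) / sd


random.seed(0)  # fixed seed so the experiment is repeatable

# Skewed (exponential) samples of several sizes, 200 replications each.
ratios = [
    hs_ratio([random.expovariate(1.0) for _ in range(n)])
    for n in (3, 5, 10, 50, 200)
    for _ in range(200)
]
max_ratio = max(ratios)

# The classical inequality guarantees this stays at or below 1.
print(f"largest observed ratio: {max_ratio:.4f}")
```

Skewed data are used because symmetric samples have mean and median close together, which makes the ratio nearly zero and the check uninformative.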
Estimation and inference in statistics pose significant challenges when data are collected adaptively. Even in linear models, the Ordinary Least Squares (OLS) estimator may fail to exhibit asymptotic normality for single coordinate…
This paper is concerned with the problem of conditional independence testing for discrete data. In recent years, researchers have shed new light on this fundamental problem, emphasizing finite-sample optimality. The non-asymptotic viewpoint…
Stein Variational Gradient Descent (SVGD) is a nonparametric particle-based deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a…
Log-concave sampling has witnessed remarkable algorithmic advances in recent years, but the corresponding problem of proving lower bounds for this task has remained elusive, with lower bounds previously known only in dimension one. In this…