Statistics Theory
Sparse additive models are an attractive choice in circumstances calling for modelling flexibility in the face of high dimensionality. We study the signal detection problem and establish the minimax separation rate for the detection of a…
The classical $k$-means clustering requires a complete data matrix without missing entries. As a natural extension of the $k$-means clustering for missing data, the $k$-POD clustering has been proposed, which ignores the missing entries in…
We consider a linear regression model with complex-valued response and predictors from a compact and connected Lie group. The regression model is formulated in terms of eigenfunctions of the Laplace-Beltrami operator on the Lie group. We…
Bayesian statistics has two common measures of central tendency of a posterior distribution: posterior means and Maximum A Posteriori (MAP) estimates. In this paper, we discuss a connection between MAP estimates and posterior means. We…
This paper considers the problem of inferring the causal effect of a variable $Z$ on a dependently censored survival time $T$. We allow for unobserved confounding variables, such that the error term of the regression model for $T$ is…
We consider a class of conditional forward-backward diffusion models for conditional generative modeling, that is, generating new data given a covariate (or control variable). To formally study the theoretical properties of these…
In this article, we study the asymptotic behaviour of the residual autocorrelations for periodic vector autoregressive time series models (PVAR henceforth) with uncorrelated but dependent innovations (i.e., weak PVAR). We then deduce the…
This work presents a probabilistic method for analyzing linear process data with weakly dependent innovations, focusing on detecting change-points in the mean and estimating its spectral density. We develop a test for…
In the standard formulation of the denoising problem, one is given a probabilistic model relating a latent variable $\Theta \in \Omega \subset \mathbb{R}^m \; (m\ge 1)$ and an observation $Z \in \mathbb{R}^d$ according to: $Z \mid \Theta…
Treatment effect estimation under unconfoundedness is a fundamental task in causal inference. In response to the challenge of analyzing high-dimensional datasets collected in substantive fields such as epidemiology, genetics, economics, and…
A common task in high-throughput biology is to screen for associations across thousands of units of interest, e.g., genes or proteins. Often, the data for each unit are modeled as Gaussian measurements with unknown mean and variance and are…
For statistical analysis of network data, the $\beta$-model has emerged as a useful tool, thanks to its flexibility in incorporating nodewise heterogeneity and theoretical tractability. To generalize the $\beta$-model, this paper proposes…
Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics. While optimal estimation procedures have been developed with well-understood properties, the increasing demand for…
Kolmogorov-Smirnov (KS) tests rely on the convergence to zero of the KS-distance $d(F_n,G)$ in the one-sample case, and of $d(F_n,G_m)$ in the two-sample case. In each case the assumption (the null hypothesis) is that $F=G$, and so…
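To make the two distances in the preceding abstract concrete, here is a minimal sketch of how $d(F_n,G)$ and $d(F_n,G_m)$ are commonly computed as the supremum of the absolute difference between distribution functions. The function names `ks_distance_one_sample` and `ks_distance_two_sample` are hypothetical and chosen for illustration; the sketch follows the textbook definition of the KS distance and is not taken from the paper itself.

```python
import numpy as np


def ks_distance_one_sample(sample, cdf):
    """Sup-distance between the empirical CDF F_n of `sample` and a reference CDF G.

    `cdf` is assumed to be a vectorised callable returning G(x) for an array x.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    g = cdf(x)
    # F_n jumps at each order statistic, so the supremum is attained either
    # just after a jump (F_n = i/n) or just before it (F_n = (i-1)/n).
    d_plus = np.max(np.arange(1, n + 1) / n - g)
    d_minus = np.max(g - np.arange(0, n) / n)
    return float(max(d_plus, d_minus))


def ks_distance_two_sample(a, b):
    """Sup-distance between the empirical CDFs F_n of `a` and G_m of `b`."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Both empirical CDFs are step functions, so it suffices to compare
    # them at the pooled sample points.
    grid = np.concatenate([a, b])
    f_n = np.searchsorted(a, grid, side="right") / a.size
    g_m = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(f_n - g_m)))


if __name__ == "__main__":
    uniform_cdf = lambda x: np.clip(x, 0.0, 1.0)  # G for Uniform(0, 1)
    print(ks_distance_one_sample([0.25, 0.5, 0.75], uniform_cdf))
    print(ks_distance_two_sample([1, 2, 3], [4, 5, 6]))
```

Under the null hypothesis $F=G$, both quantities shrink towards zero as the sample sizes grow, which is the convergence the abstract refers to.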
The distribution functions of the matricvariate beta type I and II distributions are studied under real normed division algebras. The unified approach, covering the real, complex, quaternion and octonion cases, also considers general properties and…
We exhibit the analog of the entropy map for multivariate Gaussian distributions on local fields. As in the real case, the image of this map lies in the supermodular cone and it determines the distribution of the valuation vector. In…
We adapt Gaussian processes for estimating the average dose-response function in observational settings, introducing a powerful complement to treatment effect estimation for understanding heterogeneous effects. We incorporate samples from a…
In this paper, we consider the tail probabilities of the extremals of the $\beta$-Jacobi ensemble, which plays an important role in multivariate analysis. The key steps in constructing estimators rely on the rate functions of large deviations.…
We study the optimization landscape of deep linear neural networks with the square loss. It is known that, under weak assumptions, there are no spurious local minima and no local maxima. However, the existence and diversity of non-strict…
We study estimation of large Dynamic Factor models implemented through the Expectation Maximization (EM) algorithm, jointly with the Kalman smoother. We prove that as both the cross-sectional dimension, $n$, and the sample size, $T$,…