Statistics Theory
We derive explicit upper bounds for the number of nondegenerate critical points of a $k$-component Gaussian mixture density in $\mathbb{R}^d$, and the number of modes when the modal set is finite, together with lower bounds. By normalizing…
Operator-valued concentration inequalities are foundational to the analysis of modern high-dimensional statistics and randomized algorithms. However, standard oracle bounds are frequently limited in practice: they require explicit a priori…
For a phenomenon $\boldsymbol{f}$ that is a function of $n$ factors, defined on a finite abelian group $G$, we derive its population statistics solely from its Fourier transform $\hat{\boldsymbol{f}}$. Our main result is an…
We show that high-accuracy guarantees for log-concave sampling -- that is, iteration and query complexities which scale as $\mathrm{poly}\log(1/\delta)$, where $\delta$ is the desired target accuracy -- are achievable using stochastic…
In this paper, we apply the Feature Space Decomposition (FSD) method developed in [LS24, GLS25, LSSW26, ALSS26] to obtain, under fairly general conditions, matching upper and lower bounds for the population excess risk of spectral methods…
Distributional regression aims to find the best candidate in a given parametric family of conditional distributions to model a given dataset. As each candidate in the distribution family can be identified by the corresponding distribution…
Long-term outcomes are often unavailable in randomized clinical trials, although short-term surrogate outcomes are commonly observed. External observational data may contain the long-term outcome, but causal comparisons based on such data…
We study optimal monopoly pricing over consumer networks governed by general nonlinear utilities. In our framework, a consumer's utility is jointly determined by an individualized price and the consumption choices of their peers, propagated…
Accurately estimating the proportion of true signals among a large number of variables is crucial for enhancing the precision and reliability of scientific research. Traditional signal proportion estimators often assume independence among…
High-dimensional planted problems, such as finding a hidden dense subgraph within a random graph, often exhibit a gap between statistical and computational feasibility. While recovering the hidden structure may be statistically possible, it…
Mixture model-based frameworks are very popular for statistical inference in clustering. While convenient for producing probabilistic estimates of cluster assignments and uncertainty, they are prone to misspecification, which can lead to…
The maximum likelihood threshold of a statistical model is the minimum number of datapoints required to fit the model via maximum likelihood estimation. In this paper we determine the maximum likelihood thresholds of generic linear…
The $\gamma$-divergence is well known for its strong robustness against heavy contamination. By virtue of this property, many applications of the $\gamma$-divergence have been proposed. There are two types of $\gamma$-divergence for regression…
In statistics, permutations typically arise in the context of rank plots for two-dimensional data. Such plots can also be interpreted as discrete copulas. In discrete mathematics, typically in the context of the description of large…
Estimation of the extreme value index under right censoring is a fundamental problem in extreme value theory, with important applications in finance, insurance, and reliability. Classical integral estimators for Pareto-type tails typically…
It has been recently shown that e-processes are sufficient for sequential testing in the following sense: every level-$\alpha$ sequential test can be obtained by thresholding an e-process at $1/\alpha$. However, in the above result, neither…
Zador's celebrated theorem is a cornerstone of optimal quantisation, establishing both the weak limit of the empirical distribution of an $n$-point optimal quantiser in $\mathbb{R}^d$ and the decay rate of the associated $L_s$-mean quantisation…
The quantitative analysis of financial time series often reveals two distinct features that standard Gaussian frameworks fail to capture: heavy-tailed marginal distributions and the phenomenon of extreme co-movements. While extreme value…
We introduce the horospherical depth, an intrinsic notion of statistical depth on Hadamard manifolds, and define the Busemann median as the set of its maximizers. The construction exploits the fact that the linear functionals appearing in…
This paper addresses the statistical estimation of Gaussian Mixture Models (GMMs) with unknown diagonal covariances from independent and identically distributed samples. We employ the Beurling-LASSO (BLASSO), a convex optimization framework…