Statistics Theory
We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant…
I prove a semiparametric Bernstein-von Mises theorem for a partially linear regression model with independent priors for the low-dimensional parameter of interest and the infinite-dimensional nuisance parameters. My result avoids a…
Given i.i.d. observations uniformly distributed on a closed submanifold of the Euclidean space, we study higher-order generalizations of graph Laplacians, so-called Hodge Laplacians on graphs, as approximations of the Laplace-Beltrami…
Nonparametric regression with random design is considered. The $L_2$ error with integration with respect to the design measure is used as the error criterion. An over-parametrized deep neural network regression estimate with logistic…
Accurate tuning of hyperparameters is crucial to ensure that models can generalise effectively across different settings. In this paper, we present theoretical guarantees for hyperparameter selection using variational Bayes in the…
For linear inverse problems with Gaussian priors and Gaussian observation noise, the posterior is Gaussian, with mean and covariance determined by the conditioning formula. Using the Feldman-Hajek theorem, we analyse the prior-to-posterior…
The non-parametric version of Amari's dually affine Information Geometry provides a practical calculus to perform computations of interest in statistical machine learning. The method uses the notion of a statistical bundle, a mathematical…
We consider the problem of sequential hypothesis testing by betting. For a general class of composite testing problems -- which include bounded mean testing, equal mean testing for bounded random tuples, and some key ingredients of…
This paper introduces a periodic multivariate Poisson autoregression with potentially infinite memory, with a special focus on the network setting. Using contraction techniques, we study the stability of such a process and provide upper…
Large-scale multiple testing under static factor models is widely used to detect sparse signals in high-dimensional data. However, static factor models are arguably too stringent because they ignore serial correlation, which seriously…
Heavy tails are often found in practice, and yet they are an Achilles heel of a variety of mainstream random probability measures such as the Dirichlet process (DP). The first contribution of this paper focuses on characterizing the tails…
This paper focuses on nonparametric statistical inference of the hazard rate function of discrete distributions based on $\delta$-record data. We derive the explicit expression of the maximum likelihood estimator and determine its exact…
The goal of this paper is to propose a new approach to asymptotic analysis of the finite predictor for stationary sequences. It produces the exact asymptotics of the relative prediction error and the partial correlation coefficients. The…
Empirical likelihood serves as a powerful tool for constructing confidence intervals in nonparametric regression and regression discontinuity designs (RDD). The original empirical likelihood framework can be naturally extended to these…
Multiple works regarding convergence analysis of Markov chains have led to spectral gap decomposition formulas of the form \[ \mathrm{Gap}(S) \geq c_0 \left[\inf_z \mathrm{Gap}(Q_z)\right] \mathrm{Gap}(\bar{S}), \] where $c_0$ is a…
Multiparameter persistent homology is a generalization of classical persistent homology, a central and widely used methodology from topological data analysis, which takes into account density estimation and is an effective tool for data…
In survival analysis, the estimation of the proportion of subjects who will never experience the event of interest, termed the cure rate, has received considerable attention recently. Its estimation can be a particularly difficult task when…
Spectral estimation is an important tool in time series analysis, with applications including economics, astronomy, and climatology. The asymptotic theory for non-parametric estimation is well known, but the development of non-asymptotic…
Spectral analysis plays a crucial role in high-dimensional statistics, where determining the asymptotic distribution of various spectral statistics remains a challenging task. Due to the difficulties of deriving the analytic form, recent…
We establish a bijection between marginal independence models on $n$ random variables and split closed order ideals in the poset of partial set partitions. We also establish that every discrete marginal independence model is toric in cdf…