Statistics Theory
We derive high-dimensional Gaussian comparison results for the standard $V$-fold cross-validated risk estimates. Our results combine a recent stability-based argument for the low-dimensional central limit theorem of cross-validation with…
This paper studies optimal estimation of large-dimensional nonlinear factor models. The key challenge is that the observed variables are possibly nonlinear functions of some latent variables where the functional forms are left unspecified.…
This paper gives unified expressions for the characteristic functions of all elliptical and related distributions. These distributions include the multivariate elliptically symmetric distributions and some…
Under differential privacy with sub-Gamma noise, we derive the asymptotic properties of a class of binary-valued network models with a general link function. In this paper, we release the degree sequences of the binary networks…
Two-sample tests that use a similarity graph on the observations are useful for high-dimensional and non-Euclidean data because of their flexibility and good performance under a wide range of alternatives. Existing work has mainly focused on sparse…
The Benjamini-Hochberg (BH) procedure remains widely popular despite having limited theoretical guarantees in the commonly encountered scenario of correlated test statistics. Of particular concern is the possibility that the method could…
The present paper introduces a fully objective Bayesian analysis to obtain the posterior distribution of an entropy measure. Notably, we consider the gamma distribution, which describes many natural phenomena in physics, engineering, and…
In this paper, we develop a simple nonparametric test of normality based on the distance between the empirical zero-bias transformation and the empirical distribution. The asymptotic properties of the test statistic are studied.…
Clustering of extreme events can have profound and detrimental societal consequences. The extremal index, a number in the unit interval, is a key parameter in modelling the clustering of extremes. The study of the extremal index often assumes a…
We investigate a class of recovery problems for which observations are a noisy combination of continuous and step functions. These problems can be seen as non-injective instances of non-linear ICA with direct applications to image…
Large datasets are often affected by cell-wise outliers in the form of missing or erroneous data. However, discarding any samples containing outliers may result in a dataset that is too small to accurately estimate the covariance matrix.…
In this paper, we study the estimation of the derivative of a regression function in a standard univariate regression model. The estimators are defined either by differentiating nonparametric least-squares estimators of the regression function…
Robust methods, though ubiquitous in practice, are yet to be fully understood in the context of regularized estimation and high dimensions. Even simple questions become challenging very quickly. For example, classical statistical theory…
Sequential data collection has emerged as a widely adopted technique for enhancing the efficiency of data gathering processes. Despite its advantages, such a data collection mechanism often introduces complexities to the statistical inference…
We examine the behaviour of the Laplace and saddlepoint approximations in the high-dimensional setting, where the dimension of the model is allowed to increase with the number of observations. Approximations to the joint density, the…
Uncertainty quantification and false selection error rate (FSR) control are crucial in many high-consequence scenarios, so we need models with good interpretability. This article introduces the optimality function for the binary…
Donoho and Kipnis (2022) showed that the higher criticism (HC) test statistic has a non-Gaussian phase transition but remarked that it is probably not optimal in the detection of sparse differences between two large frequency tables…
Local learning rules in biological neural networks (BNNs) are commonly referred to as Hebbian learning. [26] links a biologically motivated Hebbian learning rule to a specific zeroth-order optimization method. In this work, we study a…
We propose a new class of Markov chain Monte Carlo methods, called $k$-polar slice sampling ($k$-PSS), as a technical tool that interpolates between and extrapolates beyond uniform and polar slice sampling. By examining Wasserstein…
Under suitable assumptions on the target and initial distribution, polar slice sampling, a Markov chain construction for approximate sampling, performs provably independently of the state space dimension. We extend the aforementioned result…