Statistics Theory
Classification is an important statistical learning tool. In real applications, besides high prediction accuracy, it is often desirable to estimate class conditional probabilities for new observations. For traditional problems where the…
We study the minimax rate of estimation in nonparametric exponential family regression under star-shaped constraints. Specifically, the parameter space $K$ is a star-shaped set contained within a bounded box $[-M, M]^n$, where $M$ is a…
We prove that the only admissible way of merging arbitrary e-values is to use a weighted arithmetic average. This result completes the picture of merging methods for e-values, and generalizes the result of Vovk and Wang (2021, Annals of…
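As a minimal illustration of the merging rule named in the entry above, the sketch below computes a weighted arithmetic average of e-values. The function name `merge_e_values` and the default of equal weights are assumptions made for this example, not taken from the paper. The only fact relied on is linearity of expectation: a convex combination of e-values (non-negative statistics with expectation at most 1 under the null) again has expectation at most 1.

```python
def merge_e_values(e_values, weights=None):
    """Merge e-values by a weighted arithmetic average.

    Hypothetical helper for illustration. Weights must be non-negative
    and sum to 1; equal weights are used when none are given.
    """
    n = len(e_values)
    if n == 0:
        raise ValueError("need at least one e-value")
    if weights is None:
        weights = [1.0 / n] * n  # assumption: default to the plain average
    if len(weights) != n:
        raise ValueError("weights and e_values must have the same length")
    if any(e < 0 for e in e_values):
        raise ValueError("e-values must be non-negative")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Linearity of expectation keeps the merged statistic an e-value.
    return sum(w * e for w, e in zip(weights, e_values))
```

For example, `merge_e_values([2.0, 4.0])` averages the two inputs with equal weights, while passing `weights=[0.75, 0.25]` gives more influence to the first e-value.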
Stochastic inverse problems considered in this article consist of estimating the probability distributions of intrinsically random inputs of computer models. These estimations are based on observable outputs affected by model noise, and…
We propose another proof of the high-dimensional spectrum convergence of the weighted sample covariance, more concise and self-contained, but under stronger (though reasonable) assumptions. We explain and illustrate this theorem for different…
The success of large-scale models in recent years has increased the importance of statistical models with numerous parameters. Several studies have analyzed over-parameterized linear models with high-dimensional data, which may not be…
Structural changes occur quite frequently in dynamic networks, and their detection is an important question in many situations such as fraud detection or cybersecurity. Real-life networks are often incompletely observed due to individual…
We consider two random variables $X$ and $Y$ following correlated Gamma distributions, characterized by identical scale and shape parameters and a linear correlation coefficient $\rho$. Our focus is on the parameter: \[ D(X,Y) = \frac{|X -…
Spectral estimation is a fundamental problem in time series analysis, with wide applications in economics, speech analysis, seismology, and control systems. The asymptotic convergence theory for classical non-parametric estimators is…
Multi-target linear shrinkage is an extension of the standard single-target linear shrinkage for covariance estimation. We combine several constant matrices, the targets, with the sample covariance matrix. We derive the oracle and a…
We provide non-asymptotic rates of convergence of the Wasserstein Generative Adversarial Networks (WGAN) estimator. We build neural network classes representing the generators and discriminators which yield a GAN that achieves the minimax…
Early work established convergence of the principal component estimators of the factors and loadings up to a rotation for large dimensional approximate factor models with weak factors in that the factor loading $\Lambda^{(0)}$ scales…
Log-linear exponential random graph models are a specific class of statistical network models that have a log-linear representation. This class includes many stochastic blockmodel variants. In this paper, we focus on $\beta$-stochastic…
Random forest methods belong to the class of non-parametric machine learning algorithms. They were first introduced in 2001 by Breiman and perform accurately in high-dimensional settings. In this article, we consider a simplified…
We study the large-width asymptotics of random fully connected neural networks with weights drawn from $\alpha$-stable distributions, a family of heavy-tailed distributions arising as the limiting distributions in the Gnedenko-Kolmogorov…
Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data-points in a $d$ dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time and a…
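For readers unfamiliar with the baseline mentioned in the entry above, here is a minimal sketch of the classical rank-one Oja update, which keeps a single $d$-dimensional vector in memory and does $O(d)$ work per data point. It is not the method of the paper. The function names, the constant step size, and the uniform initialization are assumptions chosen for illustration; the sketch estimates the top eigenvector of the second-moment matrix of the stream.

```python
import math


def oja_top_component(stream, dim, step_size=0.1):
    """Estimate the leading principal direction from a stream of points.

    Illustrative sketch: constant step size and uniform initialization
    are assumptions, not prescriptions from the source.
    """
    w = [1.0 / math.sqrt(dim)] * dim  # O(d) state, nothing else is stored
    for x in stream:
        # Projection of the current point onto the running estimate.
        proj = sum(wi * xi for wi, xi in zip(w, x))
        # Oja update: w <- w + step * (x . w) * x, i.e. (I + step * x x^T) w,
        # which is never the zero vector for step > 0.
        w = [wi + step_size * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]  # renormalize to unit length
    return w


def sin_squared_error(w, v):
    """Return sin^2 of the angle between w and v, clamped at zero."""
    dot = sum(wi * vi for wi, vi in zip(w, v))
    norm_w_sq = sum(wi * wi for wi in w)
    norm_v_sq = sum(vi * vi for vi in v)
    return max(0.0, 1.0 - dot * dot / (norm_w_sq * norm_v_sq))
```

On a toy stream that alternates between `(3.0, 0.0)` and `(0.0, 1.0)`, the estimate aligns with the first coordinate axis, since that direction carries most of the second moment.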
This work addresses large dimensional covariance matrix estimation with unknown mean. The empirical covariance estimator fails when dimension and number of samples are proportional and tend to infinity, settings known as Kolmogorov…
A common observation in data-driven applications is that high-dimensional data have a low intrinsic dimension, at least locally. In this work, we consider the problem of point estimation for manifold-valued data. Namely, given a finite set…
We construct a family of estimators for a regression function based on a sample following a q-distribution. Our approach is nonparametric, using kernel methods built from operations that leverage the properties of q-calculus. Furthermore,…
Community detection, which focuses on recovering the group structure within networks, is a crucial and fundamental task in network analysis. However, the detection process can be quite challenging and unstable when community signals are…