Statistics Theory
This paper proposes families of multimatricvariate and multimatrix variate distributions based on elliptically contoured laws in the context of real normed division algebras. This work makes it possible to answer the following inference problems about…
Optimal transport (OT) serves as a natural framework for comparing probability measures, with applications in statistics, machine learning, and applied mathematics. Alas, statistical estimation and exact computation of the OT distances…
We revisit and refine known tail inequalities and confidence bounds for the hypergeometric distribution, i.e., for the setting where we sample without replacement from a fixed population with binary values or properties. The results are…
We introduce the first probabilistic framework tailored for sequential random projection, an approach rooted in the challenges of sequential decision-making under uncertainty. The analysis is complicated by the sequential dependence and…
In this chapter, we study Information Geometry from a particular non-parametric or functional point of view. The basic model is a subset of probability measures, usually specified by regularity conditions. For example, probability measures mutually…
High-dimensional autocovariance matrices play an important role in dimension reduction for high-dimensional time series. In this article, we establish the central limit theorem (CLT) for spiked eigenvalues of high-dimensional sample…
For testing goodness of fit, we consider a class of U-statistics of overlapping spacings of order two, and investigate their asymptotic properties. The standard U-statistic theory is not directly applicable here as the overlapping spacings…
We develop a computationally tractable method for estimating the optimal map between two distributions over $\mathbb{R}^d$ with rigorous finite-sample guarantees. Leveraging an entropic version of Brenier's theorem, we show that our…
Using the fact that some depth functions characterize certain families of distribution functions and that, under some mild conditions, the distribution of the depth is continuous, we have constructed several new multivariate goodness of fit tests…
Assume that we have a random sample from an absolutely continuous distribution (univariate or multivariate) with a known functional form and some unknown parameters. In this paper, we have studied several parametric tests based on…
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards…
We prove the claim in the title under mild conditions which are usually satisfied when trying to establish asymptotic normality. We assume strictly stationary and absolutely regular data.
Recent studies show that transformer-based architectures emulate gradient descent during a forward pass, contributing to in-context learning capabilities, an ability where the model adapts to new tasks based on a sequence of prompt…
Permutation synchronization is an important problem in computer science that constitutes the key step of many computer vision tasks. The goal is to recover $n$ latent permutations from their noisy and incomplete pairwise measurements. In…
Spectral clustering has been widely used for community detection in network sciences. While its empirical successes are well-documented, a clear theoretical understanding, particularly for sparse networks where degrees are much smaller than…
Cubic transmuted (CT) distributions were introduced recently by \cite{granzotto2017cubic}. In this article, we derive Shannon entropy, Gini's mean difference and Fisher information (matrix) for CT distributions and establish some of their…
Consider a Non-Parametric Empirical Bayes (NPEB) setup. We observe $Y_i \sim f(y|\theta_i)$, $\theta_i \in \Theta$, independent, where $\theta_i \sim G$ are independent, $i=1,\dots,n$. The mixing distribution $G$ is unknown $G…
This paper analyzes the estimation of econometric models by penalizing the sum of squares of the residuals with a factor that makes the model estimates approximate those that would be obtained when considering the possible simple…
Sparse linear regression is one of the classical and extensively studied problems in high-dimensional statistics and compressed sensing. Despite the substantial body of literature dedicated to this problem, the precise determination of its…
We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence…