Statistical Theory - Scifaro

Multimatricvariate and multimatrix variate distributions based on elliptically contoured laws under real normed division algebras

This paper proposes families of multimatricvariate and multimatrix variate distributions based on elliptically contoured laws in the context of real normed division algebras. The work makes it possible to answer the following inference problems about…

Statistical Theory · Mathematics 2024-05-14 José A. Díaz-García, Francisco J. Caro-Lopera

Neural Estimation Of Entropic Optimal Transport

Optimal transport (OT) serves as a natural framework for comparing probability measures, with applications in statistics, machine learning, and applied mathematics. Alas, statistical estimation and exact computation of the OT distances…

Statistical Theory · Mathematics 2024-05-14 Tao Wang, Ziv Goldfeld

Hypergeometric Distribution Revisited: Tail Inequalities, Confidence Bounds and Sample Sizes

We revisit and refine known tail inequalities and confidence bounds for the hypergeometric distribution, i.e., for the setting where we sample without replacement from a fixed population with binary values or properties. The results are…

Statistical Theory · Mathematics 2024-05-14 Anne-Marie George

Probability Tools for Sequential Random Projection

We introduce the first probabilistic framework tailored for sequential random projection, an approach rooted in the challenges of sequential decision-making under uncertainty. The analysis is complicated by the sequential dependence and…

Statistical Theory · Mathematics 2024-05-14 Yingru Li

Dually affine Information Geometry modeled on a Banach space

In this chapter, we study Information Geometry from a particular non-parametric or functional point of view. The basic model is a subset of probability measures, usually specified by regularity conditions. For example, probability measures mutually…

Statistical Theory · Mathematics 2024-05-14 Goffredo Chirco, Giovanni Pistone

Spiked eigenvalues of high-dimensional sample autocovariance matrices: CLT and applications

High-dimensional autocovariance matrices play an important role in dimension reduction for high-dimensional time series. In this article, we establish the central limit theorem (CLT) for spiked eigenvalues of high-dimensional sample…

Statistical Theory · Mathematics 2024-05-14 Daning Bi, Xiao Han, Adam Nie, Yanrong Yang

U-statistic based on overlapping sample spacings

For testing goodness of fit, we consider a class of U-statistics of overlapping spacings of order two, and investigate their asymptotic properties. The standard U-statistic theory is not directly applicable here as the overlapping spacings…

Statistical Theory · Mathematics 2024-05-14 Rahul Singh, Neeraj Misra

Entropic estimation of optimal transport maps

We develop a computationally tractable method for estimating the optimal map between two distributions over $\mathbb{R}^d$ with rigorous finite-sample guarantees. Leveraging an entropic version of Brenier's theorem, we show that our…

Statistical Theory · Mathematics 2024-05-14 Aram-Alexandre Pooladian, Jonathan Niles-Weed

Some multivariate goodness of fit tests based on data depth

Using the fact that some depth functions characterize certain families of distribution functions and that, under some mild conditions, the distribution of the depth is continuous, we have constructed several new multivariate goodness of fit tests…

Statistical Theory · Mathematics 2024-05-14 Rahul Singh, Subhajit Dutta, Neeraj Misra

Some parametric tests based on sample spacings

Assume that we have a random sample from an absolutely continuous distribution (univariate, or multivariate) with a known functional form and some unknown parameters. In this paper, we have studied several parametric tests based on…

Statistical Theory · Mathematics 2024-05-14 Rahul Singh, Neeraj Misra

Dimension-agnostic inference using cross U-statistics

Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards…

Statistical Theory · Mathematics 2024-05-14 Ilmun Kim, Aaditya Ramdas

Asymptotic Normality of $U$-Statistics is Equivalent to Convergence in the Wasserstein Distance

We prove the claim in the title under mild conditions which are usually satisfied when trying to establish asymptotic normality. We assume strictly stationary and absolutely regular data.

Statistical Theory · Mathematics 2024-05-13 Marius Kroll

Finite Sample Analysis and Bounds of Generalization Error of Gradient Descent in In-Context Linear Regression

Recent studies show that transformer-based architectures emulate gradient descent during a forward pass, contributing to in-context learning capabilities - an ability where the model adapts to new tasks based on a sequence of prompt…

Statistical Theory · Mathematics 2024-05-13 Karthik Duraisamy

A Novel and Optimal Spectral Method for Permutation Synchronization

Permutation synchronization is an important problem in computer science that constitutes the key step of many computer vision tasks. The goal is to recover $n$ latent permutations from their noisy and incomplete pairwise measurements. In…

Statistical Theory · Mathematics 2024-05-13 Duc Nguyen, Anderson Ye Zhang

Fundamental Limits of Spectral Clustering in Stochastic Block Models

Spectral clustering has been widely used for community detection in network sciences. While its empirical successes are well-documented, a clear theoretical understanding, particularly for sparse networks where degrees are much smaller than…

Statistical Theory · Mathematics 2024-05-13 Anderson Ye Zhang

Different informational characteristics of cubic transmuted distributions

Cubic transmuted (CT) distributions were introduced recently by Granzotto et al. (2017). In this article, we derive Shannon entropy, Gini's mean difference and Fisher information (matrix) for CT distributions and establish some of their…

Statistical Theory · Mathematics 2024-05-13 Shital Saha, Suchandan Kayal, N. Balakrishnan

Consistent Empirical Bayes estimation of the mean of a mixing distribution without identifiability assumption. With applications to treatment of non-response

Consider a Non-Parametric Empirical Bayes (NPEB) setup. We observe $Y_i \sim f(y|\theta_i)$, $\theta_i \in \Theta$, independent, where $\theta_i \sim G$ are independent, $i=1,\dots,n$. The mixing distribution $G$ is unknown $G…

Statistical Theory · Mathematics 2024-05-10 Eitan Greenshtein

Estimation of ill-conditioned models using penalized sums of squares of the residuals

This paper analyzes the estimation of econometric models by penalizing the sum of squares of the residuals with a factor that makes the model estimates approximate those that would be obtained when considering the possible simple…

Statistical Theory · Mathematics 2024-05-10 Román Salmerón Gómez, Catalina B. García García

A note on the minimax risk of sparse linear regression

Sparse linear regression is one of the classical and extensively studied problems in high-dimensional statistics and compressed sensing. Despite the substantial body of literature dedicated to this problem, the precise determination of its…

Statistical Theory · Mathematics 2024-05-10 Yilin Guo, Shubhangi Ghosh, Haolei Weng, Arian Maleki

Codivergences and information matrices

We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence…

Statistical Theory · Mathematics 2024-05-10 Alexis Derumigny, Johannes Schmidt-Hieber