Related papers: Design based incomplete U-statistics

200 papers

This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of Big Data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations both tend to be large, and the computation of…

Statistics Theory · Mathematics 2019-01-29 Xiaohui Chen, Kengo Kato

Incomplete U-statistics have been proposed to accelerate computation. They use only a subset of the subsamples required for kernel evaluations by complete U-statistics. This paper gives a finite sample bound in the style of Bernstein's…

Statistics Theory · Mathematics 2022-07-08 Andreas Maurer

U-statistics are a fundamental class of estimators that generalize the sample mean and underpin much of nonparametric statistics. Although extensively studied in both statistics and probability, key challenges remain: their high…

Statistics Theory · Mathematics 2026-02-19 Cesare Miglioli, Jordan Awan

Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their…

Machine Learning · Statistics 2026-04-01 Xingyu Chen, Ruiqi Zhang, Lin Liu

Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards…

Statistics Theory · Mathematics 2024-05-14 Ilmun Kim, Aaditya Ramdas

Clustering methods are a valuable tool for the identification of patterns in high dimensional data with applications in many scientific problems. However, quantifying uncertainty in clustering is a challenging problem, particularly when…

Methodology · Statistics 2018-06-01 Marcio Valk, Gabriela Bettella Cybis

The standard A/B testing approaches in large-scale industry applications are mostly based on the t-test. These standard approaches, however, suffer from low statistical power in business settings, due to the nature of small sample sizes or…

Methodology · Statistics 2025-12-30 Changshuai Wei, Phuc Nguyen, Benjamin Zelditch, Joyce Chen

Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming. The prevalence of such datasets has consistently driven the demand for new tools and methods that exploit the…

Statistics Theory · Mathematics 2024-03-12 Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan, Matey Neykov

In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance…

Machine Learning · Statistics 2019-01-25 Stéphan Clémençon, Aurélien Bellet, Igor Colin

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require…

Machine Learning · Statistics 2025-12-17 Aaron Wei, Milad Jalali, Danica J. Sutherland

U-statistics play central roles in many statistical learning tools but face the haunting issue of scalability. Significant efforts have been devoted to accelerating computation by U-statistic reduction. However, existing results almost…

Methodology · Statistics 2023-06-07 Meijia Shao, Dong Xia, Yuan Zhang

We propose the use of U-statistics to reduce variance for gradient estimation in importance-weighted variational inference. The key observation is that, given a base gradient estimator that requires $m > 1$ samples and a total of $n > m$…

Machine Learning · Computer Science 2023-02-28 Javier Burroni, Kenta Takatsu, Justin Domke, Daniel Sheldon

This paper is mainly concerned with asymptotic studies of the weighted bootstrap for U- and V-statistics. We derive the consistency of the weighted bootstrap U- and V-statistics, based on i.i.d. and non-i.i.d. observations, from some more…

Statistics Theory · Mathematics 2012-10-23 Miklos Csorgo, Masoud M. Nasari

The EM algorithm is one of many important tools in the field of statistics. While often used for imputing missing data, its widespread applications include other common statistical tasks, such as clustering. In clustering, the EM algorithm…

Machine Learning · Statistics 2017-11-22 Val Andrei Fajardo, Jiaxi Liang

This article introduces subbagging (subsample aggregating) estimation approaches for big data analysis with memory constraints of computers. Specifically, for the whole dataset with size $N$, $m_N$ subsamples are randomly drawn, and each…

Methodology · Statistics 2021-03-05 Tao Zou, Xian Li, Xuan Liang, Hansheng Wang

$U$-statistics play a central role in statistical inference. In many modern applications, however, acquiring the labels required for $U$-statistics is costly. Motivated by recent advances in active inference, we develop an active inference…

Machine Learning · Statistics 2026-05-13 Xiaoning Wang, Yuyang Huo, Liuhua Peng, Changliang Zou

Bootstrap for nonlinear statistics like U-statistics of dependent data has been studied by several authors. This is typically done by producing a bootstrap version of the sample and plugging it into the statistic. We suggest an alternative…

Statistics Theory · Mathematics 2015-05-28 Olimjon Sh. Sharipov, Johannes Tewes, Martin Wendler

The notion of a $U$-statistic for an $n$-tuple of identical quantum systems is introduced in analogy to the classical (commutative) case: given a selfadjoint 'kernel' $K$ acting on $(\mathbb{C}^{d})^{\otimes r}$ with $r<n$, we define the…

Quantum Physics · Physics 2011-06-23 Madalin Guta, Cristina Butucea

We introduce a new method of estimation of parameters in semiparametric and nonparametric models. The method is based on estimating equations that are $U$-statistics in the observations. The $U$-statistics are based on higher order…

We study the problem of distributional approximations to high-dimensional non-degenerate $U$-statistics with random kernels of diverging orders. Infinite-order $U$-statistics (IOUS) are a useful tool for constructing simultaneous prediction…

Statistics Theory · Mathematics 2019-12-11 Yanglei Song, Xiaohui Chen, Kengo Kato