Related papers: Weighted sampling without replacement

200 papers

Hoeffding has shown that tail bounds on the distribution for sampling from a finite population with replacement also apply to the corresponding cases of sampling without replacement. (A special case of this result is that binomial tail…

Probability · Mathematics 2011-07-11 Kyle J. Luh, Nicholas Pippenger

Joining records with all other records that meet a linkage condition can result in an astronomically large number of combinations due to many-to-many relationships. For such challenging (acyclic) joins, a random sample over the join result…

Databases · Computer Science 2022-01-11 Michael Shekelyan, Graham Cormode, Peter Triantafillou, Ali Shanghooshabad, Qingzhi Ma

Concentration inequalities quantify the deviation of a random variable from a fixed value. In spite of numerous applications, such as opinion surveys or ecological counting procedures, few concentration results are known for the setting of…

Statistics Theory · Mathematics 2015-07-28 Rémi Bardenet, Odalric-Ambrym Maillard

We present a new and simple approach to concentration inequalities for functions around their expectation with respect to non-product measures, i.e., for dependent random variables. Our method is based on coupling ideas and does not use…

Probability · Mathematics 2007-05-23 J.-R. Chazottes, P. Collet, C. Kuelske, F. Redig

Multistage sampling is commonly used for household surveys when there exists no sampling frame, or when the population is scattered over a wide area. Multistage sampling usually introduces a complex dependence in the selection of the final…

Statistics Theory · Mathematics 2015-11-18 Guillaume Chauvet

The concentration of measure phenomenon may be summarized as follows: a function of many weakly dependent random variables that is not too sensitive to any of its individual arguments will tend to take values very close to its expectation.…

Probability · Mathematics 2016-11-18 Aryeh Kontorovich, Maxim Raginsky

We describe a very simple method for 'consistent sampling' that allows for sampling with replacement. The method extends previous approaches to consistent sampling, which assign a pseudorandom real number to each element, and sample those…

Data Structures and Algorithms · Computer Science 2018-08-31 Ronald L. Rivest

We study concentration inequalities for structured weighted sums of random data, including (i) tensor inner products and (ii) sequential matrix sums. We are interested in tail bounds and concentration inequalities for those structured…

Statistics Theory · Mathematics 2026-02-11 Chen Cheng , Rina Foygel Barber

Ordered pivotal sampling is one of the simplest algorithms to perform without-replacement unequal probability sampling. It has found uses in the context of longitudinal surveys and spatial sampling, and enables in particular a good spatial…

Statistics Theory · Mathematics 2015-11-02 Guillaume Chauvet

Concentration bounds for non-product, non-Haar measures are fairly recent: the first such result was obtained for contracting Markov chains by Marton in 1996 via the coupling method. The work that followed, with few exceptions, also used…

Probability · Mathematics 2012-07-10 Leonid Kontorovich

In this work, we present a new random sampling method for data streams where the probability of an element's inclusion in the sample is proportional to a weight associated with that element. Our method is based on sampling with replacement,…

Data Structures and Algorithms · Computer Science 2026-03-18 Adriano Meligrana, Adriano Fazzone

Starting with a set of weighted items, we want to create a generic sample of a certain size that we can later use to estimate the total weight of arbitrary subsets. For this purpose, we propose priority sampling which tested on Internet…

Data Structures and Algorithms · Computer Science 2007-05-23 Nick Duffield, Carsten Lund, Mikkel Thorup

We investigate the low-dimensional structure of deterministic transformations between random variables, i.e., transport maps between probability measures. In the context of statistics and machine learning, these transformations can be used…

Methodology · Statistics 2018-12-18 Alessio Spantini, Daniele Bigoni, Youssef Marzouk

This paper addresses the problem of estimating the containment and similarity between two sets using only random samples from each set, without relying on sketches of full sets. The study introduces a binomial model for predicting the…

Computation · Statistics 2025-07-22 Pranav Joshi

We establish Hoeffding-type concentration inequalities for the low and high tail bounds of sums of exchangeable random variables. Our results exhibit an anti-symmetry in such tail bounds due to the assumption of exchangeability, a…

Optimization and Control · Mathematics 2026-03-12 Nina Maria Gottschling, Michele Caprio

In this work we discuss two urn models with general weight sequences $(A,B)$ associated to them, $A=(\alpha_n)_{n\in\N}$ and $B=(\beta_m)_{m\in\N}$, generalizing two well known Pólya-Eggenberger urn models, namely the so-called sampling…

Combinatorics · Mathematics 2010-05-11 Markus Kuba

We prove that a sum of random matrices generated by a $\psi$-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure. This nonasymptotic universality principle enables sharp…

Probability · Mathematics 2026-04-29 Alexander Van Werde, Jaron Sanders

Time series datasets often contain heterogeneous signals, composed of both continuously changing quantities and discretely occurring events. The coupling between these measurements may provide insights into key underlying mechanisms of the…

Methodology · Statistics 2020-05-11 Shervin Safavi, Nikos K. Logothetis, Michel Besserve

Weighted sampling is a fundamental tool in data analysis and machine learning pipelines. Samples are used for efficient estimation of statistics or as sparse representations of the data. When weight distributions are skewed, as is often the…

Machine Learning · Computer Science 2020-08-18 Edith Cohen, Rasmus Pagh, David P. Woodruff

Initially motivated by the study of the non-asymptotic properties of non-parametric tests based on permutation methods, concentration inequalities for uniformly permuted sums have been largely studied in the literature. Recently, Delyon et…

Probability · Mathematics 2018-05-10 Mélisande Albert