Weighted Random Sampling over Data Streams
Data Structures and Algorithms
2015-07-29 v2
Abstract
In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case [2, 4], discuss sampling with and without replacement, and show adaptations of the algorithms for several WRS problems and evolving data streams.
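The abstract does not spell out the algorithms themselves. As a rough illustration of what one-pass weighted sampling without replacement can look like, the sketch below follows the widely known key-based reservoir scheme of Efraimidis and Spirakis (often called A-Res): each item with weight w receives the key u^(1/w) for a uniform random u in [0, 1), and the k items with the largest keys are kept. It corresponds to the interpretation in which each successive draw picks one of the remaining items with probability proportional to its weight. The function name, the `rng` parameter, and the choice to skip non-positive weights are assumptions made for this example, not details taken from the paper.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Draw up to k items without replacement from an iterable of (item, weight) pairs.

    Minimal sketch of key-based reservoir sampling (A-Res). The stream is read
    once and only k entries are held in memory at any time.
    """
    if k <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival_index, item); the smallest key sits at heap[0]
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # assumption: items with non-positive weight are never sampled
        key = rng.random() ** (1.0 / weight)  # larger weights push the key toward 1
        entry = (key, index, item)  # arrival index breaks ties between equal keys
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)  # evict the current smallest key
    return [item for _, _, item in heap]
```

For instance, `weighted_reservoir_sample([("a", 1.0), ("b", 5.0), ("c", 2.0)], k=2)` returns two distinct items, with "b" the most likely to appear. The min-heap keeps each update at O(log k), so the reservoir never needs to be re-sorted as the stream grows.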
Cite
@article{arxiv.1012.0256,
title = {Weighted Random Sampling over Data Streams},
author = {Pavlos S. Efraimidis},
journal = {arXiv preprint arXiv:1012.0256},
year = {2015}
}
Comments
Corrected minor typos. Infeasible items are now additionally called "overweight" items (WRS-N-P). Enriched the Introduction (Section 1) with more text and references to related work. Revised the description of sampling with a bounded number of replacements (Section 4.2).