Related papers: (Learned) Frequency Estimation Algorithms under Zi…

200 papers

Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically…

Data Structures and Algorithms · Computer Science 2023-12-13 Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal, Ali Vakilian

Frequency estimation is one of the most fundamental problems in streaming algorithms. Given a stream $S$ of elements from some universe $U=\{1 \ldots n\}$, the goal is to compute, in a single pass, a short sketch of $S$ so that for any…

Data Structures and Algorithms · Computer Science 2021-11-09 Piotr Indyk, Shyam Narayanan, David P. Woodruff

The Count-Min sketch is an important and well-studied data summarization method. It allows one to estimate the count of any item in a stream using a small, fixed size data sketch. However, the accuracy of the sketch depends on…

Data Structures and Algorithms · Computer Science 2018-11-13 Daniel Ting

Demands are increasing to measure per-flow statistics in the data plane of high-speed switches. Measuring flows with exact counting is infeasible due to processing and memory constraints, but a sketch is a promising candidate for collecting…

Networking and Internet Architecture · Computer Science 2021-11-05 SunYoung Kim, Changhun Jung, RhongHo Jang, David Mohaisen, DaeHun Nyang

Conservative Count-Min, an improved version of Count-Min sketch [Cormode, Muthukrishnan 2005], is an online-maintained hashing-based data structure summarizing element frequency information without storing elements themselves. Although…

Data Structures and Algorithms · Computer Science 2023-09-08 Éric Fusy, Gregory Kucherov

Frequency estimation in streaming data often relies on sketches like Count-Min (CM) to provide approximate answers with sublinear space. However, CM sketches introduce additive errors that disproportionately impact low-frequency elements,…

Data Structures and Algorithms · Computer Science 2025-05-27 Nima Shahbazi, Stavros Sintos, Abolfazl Asudeh

Count-Min Sketch with Conservative Updates (CMS-CU) is a popular algorithm to approximately count items' appearances in a data stream. Despite CMS-CU's widespread adoption, the theoretical analysis of its performance is still wanting…

Discrete Mathematics · Computer Science 2022-03-29 Younes Ben Mazziane, Sara Alouf, Giovanni Neglia

This paper develops conformal inference methods to construct a confidence interval for the frequency of a queried object in a very large discrete data set, based on a sketch with a lower memory footprint. This approach requires no knowledge…

Methodology · Statistics 2023-08-17 Matteo Sesia, Stefano Favaro, Edgar Dobriban

Count-Min Sketch with Conservative Updates (CMS-CU) is a memory-efficient hash-based data structure used to estimate the occurrences of items within a data stream. CMS-CU stores $m$ counters and employs $d$ hash functions to map items to…

Data Structures and Algorithms · Computer Science 2024-05-22 Younes Ben Mazziane, Othmane Marfoq

Count-Min Sketch is a widely adopted algorithm for approximate event counting in large-scale processing. However, the original version of the Count-Min Sketch (CMS) suffers from some deficiencies, especially if one is interested in the…

Information Retrieval · Computer Science 2015-02-18 Guillaume Pitel, Geoffroy Fouquier

We present a novel approach for the problem of frequency estimation in data streams that is based on optimization and machine learning. Contrary to state-of-the-art streaming frequency estimation algorithms, which heavily rely on random…

Data Structures and Algorithms · Computer Science 2022-07-19 Dimitris Bertsimas, Vassilis Digalakis

A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data. The approach is data-adaptive and requires no…

Methodology · Statistics 2022-11-10 Matteo Sesia, Stefano Favaro

An influential paper of Hsu et al. (ICLR'19) introduced the study of learning-augmented streaming algorithms in the context of frequency estimation. A fundamental problem in the streaming literature, the goal of frequency estimation is to…

Machine Learning · Computer Science 2025-03-04 Anders Aamand, Justin Y. Chen, Siddharth Gollapudi, Sandeep Silwal, Hao Wu

Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and statistics…

Data Structures and Algorithms · Computer Science 2021-11-04 Edith Cohen, Ofir Geri, Rasmus Pagh

We consider massive distributed datasets that consist of elements modeled as key-value pairs and the task of computing statistics or aggregates where the contribution of each key is weighted by a function of its frequency (sum of values of…

Data Structures and Algorithms · Computer Science 2019-12-24 Edith Cohen, Ofir Geri

The sliding window model of computation captures scenarios in which data are continually arriving in the form of a stream, and only the most recent $w$ items are used for analysis. In this setting, an algorithm needs to accurately track…

Cryptography and Security · Computer Science 2024-06-13 Yiping Wang, Yanhao Wang, Cen Chen

In data stream applications, a critical task is to estimate the frequency of each item in a given multiset, in which each item can appear multiple times. The data streams in many applications are…

Data Structures and Algorithms · Computer Science 2020-01-07 Ning Li

Sketching is a probabilistic data compression technique that has been largely developed in the computer science community. Numerical operations on big datasets can be intolerably slow; sketching algorithms address this issue by generating a…

Methodology · Statistics 2019-04-04 Daniel Ahfock, William J. Astle, Sylvia Richardson

Sketches are probabilistic data structures that can provide approximate results within mathematically proven error bounds while using orders of magnitude less memory than traditional approaches. They are tailored for streaming data analysis…

Data Structures and Algorithms · Computer Science 2019-03-05 Fatih Taşyaran, Kerem Yıldırır, Kamer Kaya, Mustafa Kemal Taş

Frequency estimation of elements is an important task for summarizing data streams and machine learning applications. The problem is often addressed by using streaming algorithms with sublinear space data structures. These algorithms allow…

Data Structures and Algorithms · Computer Science 2022-04-05 Nikita Seleznev, Senthil Kumar, C. Bayan Bruss