Related papers: Frequency Estimation with One-Sided Error

200 papers

The frequencies of the elements in a data stream are an important statistical measure and the task of estimating them arises in many applications within data analysis and machine learning. Two of the most popular algorithms…

Data Structures and Algorithms · Computer Science 2020-08-12 Anders Aamand, Piotr Indyk, Ali Vakilian

Frequency estimation in streaming data often relies on sketches like Count-Min (CM) to provide approximate answers with sublinear space. However, CM sketches introduce additive errors that disproportionately impact low-frequency elements,…

Data Structures and Algorithms · Computer Science 2025-05-27 Nima Shahbazi, Stavros Sintos, Abolfazl Asudeh

Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically…

Data Structures and Algorithms · Computer Science 2023-12-13 Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal, Ali Vakilian

The Count-Min sketch is an important and well-studied data summarization method. It allows one to estimate the count of any item in a stream using a small, fixed size data sketch. However, the accuracy of the sketch depends on…

Data Structures and Algorithms · Computer Science 2018-11-13 Daniel Ting

This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An $\varepsilon$ approximate quantile sketch receives a stream of items $x_1,\ldots,x_n$…

Data Structures and Algorithms · Computer Science 2016-04-07 Zohar Karnin, Kevin Lang, Edo Liberty

Sketches are probabilistic data structures that can provide approximate results within mathematically proven error bounds while using orders of magnitude less memory than traditional approaches. They are tailored for streaming data analysis…

Data Structures and Algorithms · Computer Science 2019-03-05 Fatih Taşyaran, Kerem Yıldırır, Kamer Kaya, Mustafa Kemal Taş

We present a novel approach for the problem of frequency estimation in data streams that is based on optimization and machine learning. Contrary to state-of-the-art streaming frequency estimation algorithms, which heavily rely on random…

Data Structures and Algorithms · Computer Science 2022-07-19 Dimitris Bertsimas, Vassilis Digalakis

Modern stream processing systems often need to track the frequency of distinct keys in a data stream in real-time. Since maintaining exact counts can require a prohibitive amount of memory, many applications rely on compact, probabilistic…

Data Structures and Algorithms · Computer Science 2026-04-29 Navid Eslami, Ioana O. Bercea, Rasmus Pagh, Niv Dayan

In data stream applications, one of the critical issues is to estimate the frequency of each item in a given multiset, that is, a collection in which each item can appear multiple times. The data streams in many applications are…

Data Structures and Algorithms · Computer Science 2020-01-07 Ning Li

Frequency estimation of elements is an important task for summarizing data streams and machine learning applications. The problem is often addressed by using streaming algorithms with sublinear space data structures. These algorithms allow…

Data Structures and Algorithms · Computer Science 2022-04-05 Nikita Seleznev, Senthil Kumar, C. Bayan Bruss

Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary -- for many streaming problems, both relaxations must be employed…

Data Structures and Algorithms · Computer Science 2023-05-16 Vladimir Braverman, Robert Krauthgamer, Aditya Krishnan, Shay Sapir

Computing the approximate quantiles or ranks of a stream is a fundamental task in data monitoring. Given a stream of elements $x_1, x_2, \dots, x_n$ and a query $x$, a relative-error quantile estimation algorithm can estimate the rank of…

Data Structures and Algorithms · Computer Science 2024-11-05 Elena Gribelyuk, Pachara Sawettamalya, Hongxun Wu, Huacheng Yu

We adapt a well known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix $A \in \R^{n \times m}$ one after the other in a streaming fashion. It maintains…

Data Structures and Algorithms · Computer Science 2012-07-12 Edo Liberty

A fundamental question in streaming complexity is whether every space-efficient turnstile algorithm is implicitly a linear sketch. The landmark work of Li, Nguyen, and Woodruff [LNW14] established an equivalence between the two, but their…

Data Structures and Algorithms · Computer Science 2026-04-27 Cheng Jiang, Yinchen Liu, Huacheng Yu

Given a stream $p_1, \ldots, p_m$ of items from a universe $\mathcal{U}$, which, without loss of generality we identify with the set of integers $\{1, 2, \ldots, n\}$, we consider the problem of returning all $\ell_2$-heavy hitters, i.e.,…

Data Structures and Algorithms · Computer Science 2015-11-03 Vladimir Braverman, Stephen R. Chestnut, Nikita Ivkin, David P. Woodruff

The efficient estimation of frequency moments of a data stream in one pass using limited space and time per item is one of the most fundamental problems in data stream processing. An especially important estimation is to find the number of…

Data Structures and Algorithms · Computer Science 2010-10-29 Gokarna Sharma, Costas Busch, Srikanta Tirthapura

This paper develops conformal inference methods to construct a confidence interval for the frequency of a queried object in a very large discrete data set, based on a sketch with a lower memory footprint. This approach requires no knowledge…

Methodology · Statistics 2023-08-17 Matteo Sesia, Stefano Favaro, Edgar Dobriban

The sliding window model of computation captures scenarios in which data are continually arriving in the form of a stream, and only the most recent $w$ items are used for analysis. In this setting, an algorithm needs to accurately track…

Cryptography and Security · Computer Science 2024-06-13 Yiping Wang, Yanhao Wang, Cen Chen

Estimating the frequency of items in high-volume, fast data streams has been extensively studied in many areas, such as databases and network measurement. Traditional sketches provide only coarse estimates under strict memory constraints.…

Machine Learning · Computer Science 2026-03-26 Xinyu Yuan, Yan Qiao, Meng Li, Zhenchun Wei, Cuiying Feng, Zonghui Wang, Wenzhi Chen

Computing space-efficient summaries (\textit{a.k.a. sketches}) of large data is a central problem in streaming algorithms. Such sketches are used to answer \textit{post-hoc} queries in several data analytics tasks. The algorithm for…

Machine Learning · Computer Science 2022-03-07 Rameshwar Pratap, Bhisham Dev Verma, Raghav Kulkarni