
Related papers: A statistical analysis of probabilistic counting a…

200 papers

Cardinality estimation algorithms receive a stream of elements whose order might be arbitrary, with possible repetitions, and return the number of distinct elements. Such algorithms usually seek to minimize the required storage and…

Data Structures and Algorithms · Computer Science 2015-08-26 Reuven Cohen, Liran Katzir, Aviv Yehezkel

Cardinality estimation algorithms receive a stream of elements, with possible repetitions, and return the number of distinct elements in the stream. Such algorithms seek to minimize the required memory and CPU resource consumption at the…

Networking and Internet Architecture · Computer Science 2019-03-15 Reuven Cohen, Yuval Nezri

This paper presents new methods to estimate the cardinalities of data sets recorded by HyperLogLog sketches. A theoretically motivated extension to the original estimator is presented that eliminates the bias for small and large…

Data Structures and Algorithms · Computer Science 2017-02-27 Otmar Ertl

Structured high-cardinality data arises in many domains, and poses a major challenge for both modeling and inference. Graphical models are a popular approach to modeling structured data but they are unsuitable for high-cardinality…

Data Structures and Algorithms · Computer Science 2016-07-19 Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, Siqi Sun

We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration and the subsampling ratio explicitly controls…

Optimization and Control · Mathematics 2011-08-30 Alexandre d'Aspremont

In recent years there has been a growing interest in developing "streaming algorithms" for efficient processing and querying of continuous data streams. These algorithms seek to provide accurate results while minimizing the required storage…

Data Structures and Algorithms · Computer Science 2016-06-06 Reuven Cohen, Liran Katzir, Aviv Yehezkel

Randomized algorithms, such as randomized sketching or stochastic optimization, are a promising approach to ease the computational burden in analyzing large datasets. However, randomized algorithms also produce non-deterministic outputs,…

Methodology · Statistics 2025-05-13 Zhixiang Zhang, Sokbae Lee, Edgar Dobriban

Online monitoring user cardinalities (or degrees) in graph streams is fundamental for many applications. For example in a bipartite graph representing user-website visiting activities, user cardinalities (the number of distinct visited…

Data Structures and Algorithms · Computer Science 2018-11-27 Pinghui Wang, Peng Jia, Xiangliang Zhang, Jing Tao, Xiaohong Guan, Don Towsley

Sketch-based streaming algorithms allow efficient processing of big data. These algorithms use small fixed-size storage to store a summary ("sketch") of the input data, and use probabilistic algorithms to estimate the desired quantity.…

Databases · Computer Science 2016-11-08 Reuven Cohen, Liran Katzir, Aviv Yehezkel

The ability to preserve user privacy and anonymity is important. One of the safest ways to maintain privacy is to avoid storing personally identifiable information (PII), which poses a challenge for maintaining useful user statistics.…

Cryptography and Security · Computer Science 2019-10-17 Lu Yu, Oluwakemi Hambolu, Yu Fu, Jon Oakley, Richard R. Brooks

Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary -- for many streaming problems, both relaxations must be employed…

Data Structures and Algorithms · Computer Science 2023-05-16 Vladimir Braverman, Robert Krauthgamer, Aditya Krishnan, Shay Sapir

We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins in the context of graph database management systems: (i) optimistic estimators that make uniformity and conditional…

Databases · Computer Science 2021-05-20 Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, Ken Salem

The amount of data coming from different sources such as IoT-sensors, social networks, cellular networks, has increased exponentially during the last few years. Probabilistic Data Structures (PDS) are efficient alternatives to deterministic…

Data Structures and Algorithms · Computer Science 2022-11-02 Remy Scholler, Jean-Francois Couchot, Oumaima Alaoui-Ismaili, Denis Renaud, Eric Ballot

Flow cardinality estimation is the problem of estimating the number of distinct elements in a data flow, often with a stringent memory constraint. It has wide applications in network traffic measurement and in database systems. The virtual…

Information Theory · Computer Science 2018-12-10 Zeyu Zhou

Estimating cardinality, i.e., the number of distinct elements, of a data stream is a fundamental problem in areas like databases, computer networks, and information retrieval. This study delves into a broader scenario where each element…

Databases · Computer Science 2024-06-28 Yiyan Qi, Rundong Li, Pinghui Wang, Yufang Sun, Rui Xing

In this paper we consider the problem of maximizing a non-negative submodular function subject to a cardinality constraint in the data stream model. Previously, the best known algorithm for this problem was a $5.828$-approximation…

Data Structures and Algorithms · Computer Science 2019-06-27 Naor Alaluf, Moran Feldman

Sketching is a probabilistic data compression technique that has been largely developed in the computer science community. Numerical operations on big datasets can be intolerably slow; sketching algorithms address this issue by generating a…

Methodology · Statistics 2019-04-04 Daniel Ahfock, William J. Astle, Sylvia Richardson

We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such…

Numerical Analysis · Mathematics 2016-02-17 Philipp Hennig, Michael A Osborne, Mark Girolami

Sketching algorithms use random projections to generate a smaller sketched data set, often for the purposes of modelling. Complete and partial sketch regression estimates can be constructed using information from only the sketched data set…

Methodology · Statistics 2023-06-07 R. P. Browne, J. L. Andrews

We consider the problem of monotone, submodular maximization over a ground set of size $n$ subject to cardinality constraint $k$. For this problem, we introduce the first deterministic algorithms with linear time complexity; these…

Data Structures and Algorithms · Computer Science 2021-03-09 Alan Kuhnle