Related papers: Stream Sampling for Frequency Cap Statistics

200 papers

With the recent bloom of data, there is a huge surge in threats against individuals' private information. Various techniques for optimizing privacy-preserving data analysis have been a focus of research in recent years. In this paper, we…

Cryptography and Security · Computer Science 2022-11-11 Sayan Biswas, Graham Cormode, Carsten Maple

We consider massive distributed datasets that consist of elements modeled as key-value pairs and the task of computing statistics or aggregates where the contribution of each key is weighted by a function of its frequency (sum of values of…

Data Structures and Algorithms · Computer Science 2019-12-24 Edith Cohen, Ofir Geri

In order to remain competitive, Internet companies collect and analyse user data for the purpose of improving user experiences. Frequency estimation is a widely used statistical tool which could potentially conflict with the relevant…

Cryptography and Security · Computer Science 2021-04-14 Mengmeng Yang, Ivan Tjuawinata, Kwok-Yan Lam, Tianqing Zhu, Jun Zhao

In data stream applications, one of the critical issues is to estimate the frequency of each item in the specific multiset. The multiset means that each item in this set can appear multiple times. The data streams in many applications are…

Data Structures and Algorithms · Computer Science 2020-01-07 Ning Li

Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. Also, these data are potentially unbounded. Boundless streams of data collected from sensors, equipment, and other…

Databases · Computer Science 2012-03-12 T Soni Madhulatha

We present a novel approach for the problem of frequency estimation in data streams that is based on optimization and machine learning. Contrary to state-of-the-art streaming frequency estimation algorithms, which heavily rely on random…

Data Structures and Algorithms · Computer Science 2022-07-19 Dimitris Bertsimas, Vassilis Digalakis

We present a numerically robust, computationally efficient approach for non-I.I.D. data stream sampling in federated client systems, where resources are limited and labeled data for local model adaptation is sparse and expensive. The…

Machine Learning · Computer Science 2024-09-02 Manuel Röder, Frank-Michael Schleif

We introduce and study a new data sketch for processing massive datasets. It addresses two common problems: 1) computing a sum given arbitrary filter conditions and 2) identifying the frequent items or heavy hitters in a data set. For the…

Computation · Statistics 2017-09-14 Daniel Ting

Common datasets have the form of elements with keys (e.g., transactions and products) and the goal is to perform analytics on the aggregated form of key and frequency pairs. A weighted sample of keys by (a function of) frequency is a highly…

Machine Learning · Computer Science 2021-04-01 Edith Cohen, Ofir Geri, Tamas Sarlos, Uri Stemmer

We study the fundamental problem of frequency estimation under both privacy and communication constraints, where the data is distributed among $k$ parties. We consider two application scenarios: (1) one-shot, where the data is static and…

Cryptography and Security · Computer Science 2021-06-01 Ziyue Huang, Yuan Qiu, Ke Yi, Graham Cormode

Most sampling techniques for online social networks (OSNs) are based on a particular sampling method on a single graph, which is referred to as a statistic. However, various realizing methods on different graphs could possibly be used in…

Social and Information Networks · Computer Science 2015-12-21 Xin Wang, Richard T. B. Ma, Yinlong Xu, Zhipeng Li

Big data streams are possibly one of the most essential underlying notions. However, data streams are often challenging to handle owing to their rapid pace and limited information lifetime. It is difficult to collect and communicate stream…

Machine Learning · Computer Science 2022-03-03 Christos Karras, Aristeidis Karras, Spyros Sioutas

In the era of big data, graph sampling is indispensable in many settings. Existing sampling methods are mostly designed for static graphs, and aim to preserve basic structural properties of the original graph (such as degree distribution,…

Social and Information Networks · Computer Science 2018-02-07 Sandipan Sikdar, Tanmoy Chakraborty, Soumya Sarkar, Niloy Ganguly, Animesh Mukherjee

We present the first feasible method for sampling a dynamic data stream with deletions, where the sample consists of pairs $(k,C_k)$ of a value $k$ and its exact total count $C_k$. Our algorithms are for both Strict Turnstile data streams…

Data Structures and Algorithms · Computer Science 2012-09-26 Neta Barkay, Ely Porat, Bar Shalem

We consider the problem of sampling from data defined on the nodes of a weighted graph, where the edge weights capture the data correlation structure. As shown recently, using spectral graph theory one can define a cut-off frequency for the…

Information Theory · Computer Science 2014-11-13 Ilan Shomorony, A. Salman Avestimehr

Sampling is a standard approach in big-graph analytics; the goal is to efficiently estimate the graph properties by consulting a sample of the whole population. A perfect sample is assumed to mirror every property of the whole population.…

Social and Information Networks · Computer Science 2014-03-18 Nesreen K. Ahmed, Nick Duffield, Jennifer Neville, Ramana Kompella

One of the most common statistics computed over data elements is the number of distinct keys. A thread of research pioneered by Flajolet and Martin three decades ago culminated in the design of optimal approximate counting sketches, which…

Data Structures and Algorithms · Computer Science 2017-02-27 Edith Cohen

Streaming analytics are essential in a large range of applications, including databases, networking, and machine learning. To optimize performance, practitioners are increasingly offloading such analytics to network nodes such as switches.…

Networking and Internet Architecture · Computer Science 2025-03-19 Jonatan Langlet, Peiqing Chen, Michael Mitzenmacher, Ran Ben Basat, Zaoxing Liu, Gianni Antichi

Given a stream of data, a typical approach in streaming algorithms is to design a sophisticated algorithm with small memory that computes a specific statistic over the streaming data. Usually, if one wants to compute a different statistic…

Data Structures and Algorithms · Computer Science 2014-08-13 Vladimir Braverman, Rafail Ostrovsky, Alan Roytman

As a representative sequential pattern mining problem, counting the frequency of serial episodes from a streaming sequence has drawn continuous attention in academia due to its wide application in practice, e.g., telecommunication alarms,…

Data Structures and Algorithms · Computer Science 2018-01-30 Hui Li, Sizhe Peng, Jian Li, Jingjing Li, Jiangtao Cui, Jianfeng Ma