Related papers: Sketching Linear Classifiers over Data Streams

200 papers

Network stream mining is fundamental to many network operations. Sketches, as compact data structures that offer low memory overhead with bounded accuracy, have emerged as a promising solution for network stream mining. Recent studies…

Networking and Internet Architecture · Computer Science 2025-02-12 Yuanpeng Li, Zhen Xu, Zongwei Lv, Yannan Hu, Yong Cui, Tong Yang

Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. We propose a "compressive learning" framework where we estimate model parameters from a sketch of the training data. This sketch…

Machine Learning · Computer Science 2017-05-08 Nicolas Keriven, Anthony Bourrier, Rémi Gribonval, Patrick Pérez

A sketch is a probabilistic data structure used to record frequencies of items in a multi-set. Sketches are widely used in various fields, especially those that involve processing and storing data streams. In streaming applications with…

Data Structures and Algorithms · Computer Science 2017-02-08 Tong Yang, Lingtong Liu, Yibo Yan, Muhammad Shahzad, Yulong Shen, Xiaoming Li, Bin Cui, Gaogang Xie

Estimating cardinality, i.e., the number of distinct elements, of a data stream is a fundamental problem in areas like databases, computer networks, and information retrieval. This study delves into a broader scenario where each element…

Databases · Computer Science 2024-06-28 Yiyan Qi, Rundong Li, Pinghui Wang, Yufang Sun, Rui Xing

The rapid growth of large language models (LLMs) has outpaced the memory constraints of edge devices, necessitating extreme weight compression beyond the 1-bit limit. While quantization reduces model size, it is fundamentally limited to 1…

Machine Learning · Computer Science 2025-06-24 Sunan Zou, Ziyun Zhang, Xueting Sun, Guojie Luo

Stream monitoring is fundamental in many data stream applications, such as financial data trackers, security, anomaly detection, and load balancing. In that respect, quantiles are of particular interest, as they often capture the user's…

Data Structures and Algorithms · Computer Science 2022-01-07 Rana Shahout, Roy Friedman, Ran Ben Basat

Large, distributed data streams are now ubiquitous. High-accuracy sketches with low memory overhead have become the de facto method for analyzing this data. For instance, if we wish to group data by some label and report the largest counts…

Data Structures and Algorithms · Computer Science 2024-02-14 Homin K. Lee, Charles Masson

Graph streams represent data interactions in real applications. The mining of graph streams plays an important role in network security, social network analysis, and traffic control, among others. However, the sheer volume and high dynamics…

Databases · Computer Science 2023-04-07 Yiling Zeng, Chunyao Song, Yuhan Li, Tingjian Ge

Estimating the frequency of items in high-volume, fast data streams has been extensively studied in many areas, such as databases and network measurement. Traditional sketches provide only coarse estimates under strict memory constraints.…

Machine Learning · Computer Science 2026-03-26 Xinyu Yuan, Yan Qiao, Meng Li, Zhenchun Wei, Cuiying Feng, Zonghui Wang, Wenzhi Chen

Convolutional neural networks (CNNs) with deep architectures have substantially advanced the state of the art in computer vision tasks. However, deep networks are typically resource-intensive and thus difficult to deploy on mobile…

Neural and Evolutionary Computing · Computer Science 2017-06-08 Yiwen Guo, Anbang Yao, Hao Zhao, Yurong Chen

Recently, Bessa et al. (PODS 2023) showed that sketches based on coordinated weighted sampling theoretically and empirically outperform popular linear sketching methods like Johnson-Lindenstrauss projection and CountSketch for the ubiquitous…

Databases · Computer Science 2024-08-23 Majid Daliri, Juliana Freire, Christopher Musco, Aécio Santos, Haoxiang Zhang

We present a new approach for computing compact sketches that can be used to approximate the inner product between pairs of high-dimensional vectors. Based on the Weighted MinHash algorithm, our approach admits strong accuracy guarantees…

Summaries of massive data sets support approximate query processing over the original data. A basic aggregate over a set of records is the weight of subpopulations specified as a predicate over records' attributes. Bottom-k sketches are a…

Databases · Computer Science 2008-02-26 Edith Cohen, Haim Kaplan

Sampling of signals belonging to a low-dimensional subspace has well-documented merits for dimensionality reduction, limited memory storage, and online processing of streaming network data. When the subspace is known, these signals can be…

Information Theory · Computer Science 2019-11-26 Fernando Gama, Antonio G. Marques, Gonzalo Mateos, Alejandro Ribeiro

In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary…

Machine Learning · Computer Science 2018-02-12 Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler, Jamal Atif

In this paper, we consider the problem of estimating the distance between any two large data streams under a small-space constraint. This problem is of utmost importance in data intensive monitoring applications where input streams are…

Data Structures and Algorithms · Computer Science 2012-08-01 Emmanuelle Anceaume, Yann Busnel

This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is…

Matrix sketching is a recently developed data compression technique. An input matrix A is efficiently approximated with a smaller matrix B, so that B preserves most of the properties of A up to some guaranteed approximation ratio. In so…

Machine Learning · Statistics 2019-12-03 Roberta Falcone, Angela Montanari, Laura Anderlucci

Kernel density estimation is a simple and effective method that lies at the heart of many important machine learning applications. Unfortunately, kernel methods scale poorly for large, high dimensional datasets. Approximate kernel density…

Data Structures and Algorithms · Computer Science 2019-12-06 Benjamin Coleman, Anshumali Shrivastava

Recent work has explored transforming data sets into smaller, approximate summaries in order to scale Bayesian inference. We examine a related problem in which the parameters of a Bayesian model are very large and expensive to store in…

Machine Learning · Computer Science 2018-10-03 Joseph Tassarotti, Jean-Baptiste Tristan, Michael Wick