Related papers: Binary Coding in Stream

Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization

In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary…

Machine Learning · Computer Science 2018-02-12 Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler, Jamal Atif

Efficient Principal Subspace Projection of Streaming Data Through Fast Similarity Matching

Big data problems frequently require processing datasets in a streaming fashion, either because all data are available at once but collectively are larger than available memory or because the data intrinsically arrive one data point at a…

Computation · Statistics 2018-08-08 Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, Dmitri B. Chklovskii

Efficient Sketching Algorithm for Sparse Binary Data

Recent advancements in the WWW, IoT, social networks, e-commerce, etc. have generated a large volume of data. These datasets are mostly high-dimensional and sparse. Many fundamental subroutines of common data analytic…

Information Retrieval · Computer Science 2019-10-11 Rameshwar Pratap, Debajyoti Bera, Karthik Revanuru

Spiking Neural Networks Through the Lens of Streaming Algorithms

We initiate the study of biological neural networks from the perspective of streaming algorithms. Like computers, human brains suffer from memory limitations which pose a significant obstacle when processing large scale and dynamically…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-06 Yael Hitron, Cameron Musco, Merav Parter

Optimality of Linear Sketching under Modular Updates

We study the relation between streaming algorithms and linear sketching algorithms, in the context of binary updates. We show that for inputs in $n$ dimensions, the existence of efficient streaming algorithms which can process $\Omega(n^2)$…

Computational Complexity · Computer Science 2018-09-25 Kaave Hosseini, Shachar Lovett, Grigory Yaroslavtsev

Tensor-Based Sketching Method for the Low-Rank Approximation of Data Streams

Low-rank approximation in data streams is a fundamental and significant task in computing science, machine learning and statistics. Multiple streaming algorithms have emerged over years and most of them are inspired by randomized…

Data Structures and Algorithms · Computer Science 2022-09-30 Cuiyu Liu, Chuanfu Xiao, Mingshuo Ding, Chao Yang

Streaming supercomputing needs workflow-enabled programming-in-the-large

This is a position paper, submitted to the Future Online Analysis Platform Workshop (https://press3.mcs.anl.gov/futureplatform/), which argues that simple data analysis applications are common today, but future online supercomputing…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-27 Justin M Wozniak, Jonathan Ozik, Daniel S. Katz, Michael Wilde

Parallel and Streaming Algorithms for K-Core Decomposition

The $k$-core decomposition is a fundamental primitive in many machine learning and data mining applications. We present the first distributed and the first streaming algorithms to compute and maintain an approximate $k$-core decomposition…

Data Structures and Algorithms · Computer Science 2018-11-27 Hossein Esfandiari, Silvio Lattanzi, Vahab Mirrokni

A Streaming Algorithm for Crowdsourced Data Classification

We propose a streaming algorithm for the binary classification of data based on crowdsourcing. The algorithm learns the competence of each labeller by comparing her labels to those of other labellers on the same tasks and uses this…

Machine Learning · Statistics 2016-02-24 Thomas Bonald, Richard Combes

Simple and Deterministic Matrix Sketching

We adapt a well-known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix $A \in \mathbb{R}^{n \times m}$ one after the other in a streaming fashion. It maintains…

Data Structures and Algorithms · Computer Science 2012-07-12 Edo Liberty

Fast Exact Search in Hamming Space with Multi-Index Hashing

There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search. Although binary codes are motivated by their use as direct indices (addresses) into a hash table, codes…

Computer Vision and Pattern Recognition · Computer Science 2014-04-28 Mohammad Norouzi, Ali Punjani, David J. Fleet

Streaming Algorithms for Bin Packing and Vector Scheduling

Problems involving the efficient arrangement of simple objects, as captured by bin packing and makespan scheduling, are fundamental tasks in combinatorial optimization. These are well understood in the traditional online and offline cases,…

Data Structures and Algorithms · Computer Science 2026-01-27 Graham Cormode, Pavel Veselý

Streaming Hypergraph Partitioning Algorithms on Limited Memory Environments

Many well-known, real-world problems involve dynamic data which describe the relationship among the entities. Hypergraphs are powerful combinatorial structures that are frequently used to model such data. For many of today's data-centric…

Data Structures and Algorithms · Computer Science 2021-03-10 Fatih Taşyaran, Berkay Demireller, Kamer Kaya, Bora Uçar

Learning-Augmented Streaming Codes are Approximately Optimal for Variable-Size Messages

Real-time streaming communication requires a high quality of service despite contending with packet loss. Streaming codes are a class of codes best suited for this setting. A key challenge for streaming codes is that they operate in an…

Information Theory · Computer Science 2022-05-18 Michael Rudow, K. V. Rashmi

Fast and Quality-Guaranteed Data Streaming in Resource-Constrained Sensor Networks

In many emerging applications, data streams are monitored in a network environment. Due to limited communication bandwidth and other resource constraints, a critical and practical demand is to online compress data streams continuously with…

Data Structures and Algorithms · Computer Science 2008-12-01 Emad Soroush, Kui Wu, Jian Pei

Achieving Approximate Soft Clustering in Data Streams

In recent years, data streaming has gained prominence due to advances in technologies that enable many applications to generate continuous flows of data. This increases the need to develop algorithms that are able to efficiently process…

Data Structures and Algorithms · Computer Science 2015-03-20 Vaneet Aggarwal, Shankar Krishnan

Sketching and Streaming for Dictionary Compression

We initiate the study of sub-linear sketching and streaming techniques for estimating the output size of common dictionary compressors such as Lempel-Ziv '77, the run-length Burrows-Wheeler transform, and grammar compression. To this end,…

Data Structures and Algorithms · Computer Science 2024-08-20 Ruben Becker, Matteo Canton, Davide Cenzato, Sung-Hwan Kim, Bojana Kodric, Nicola Prezza

Online Machine Learning in Big Data Streams

The area of online machine learning in big data streams covers algorithms that are (1) distributed and (2) work from data streams with only a limited possibility to store past data. The first requirement mostly concerns software…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-19 András A. Benczúr, Levente Kocsis, Róbert Pálovics

Online Hashing with Efficient Updating of Binary Codes

Online hashing methods are efficient in learning the hash functions from the streaming data. However, when the hash functions change, the binary codes for the database have to be recomputed to guarantee the retrieval accuracy. Recomputing…

Data Structures and Algorithms · Computer Science 2019-12-05 Zhenyu Weng, Yuesheng Zhu

Streaming Similarity Self-Join

We introduce and study the problem of computing the similarity self-join in a streaming context (SSSJ), where the input is an unbounded stream of items arriving continuously. The goal is to find all pairs of items in the stream whose…

Databases · Computer Science 2016-03-09 Gianmarco De Francisci Morales, Aristides Gionis