Related papers: Downstream: efficient cross-platform algorithms fo…

Structured Downsampling for Fast, Memory-efficient Curation of Online Data Streams

Operations over data streams typically hinge on efficient mechanisms to aggregate or summarize history on a rolling basis. For high-volume data streams, it is critical to manage state in a manner that is fast and memory efficient --…

Data Structures and Algorithms · Computer Science 2024-09-24 Matthew Andres Moreno, Luis Zaman, Emily Dolson

Algorithms for Efficient, Compact Online Data Stream Curation

Data stream algorithms tackle operations on high-volume sequences of read-once data items. Data stream scenarios include inherently real-time systems like sensor networks and financial markets. They also arise in purely-computational…

Data Structures and Algorithms · Computer Science 2024-03-04 Matthew Andres Moreno, Santiago Rodriguez Papa, Emily Dolson

Stream Fusion, to Completeness

Stream processing is mainstream (again): Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking.…

Programming Languages · Computer Science 2016-12-21 Oleg Kiselyov, Aggelos Biboudis, Nick Palladinos, Yannis Smaragdakis

Data Stream Clustering: A Review

The number of connected devices is steadily increasing, and these devices continuously generate data streams. Real-time processing of data streams is arousing interest despite many challenges. Clustering is one of the most suitable methods for…

Machine Learning · Computer Science 2020-07-22 Alaettin Zubaroğlu, Volkan Atalay

FastFlow: Efficient Parallel Streaming Applications on Multi-core

Shared memory multiprocessors come back to popularity thanks to rapid spreading of commodity multi-core architectures. As ever, shared memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-09-10 Marco Aldinucci, Massimo Torquati, Massimiliano Meneghin

River: machine learning for streaming data in Python

River is a machine learning library for dynamic data streams and continual learning. It provides multiple state-of-the-art learning methods, data generators/transformers, performance metrics and evaluators for different stream learning…

Machine Learning · Computer Science 2020-12-10 Jacob Montiel, Max Halford, Saulo Martiello Mastelini, Geoffrey Bolmier, Raphael Sourty, Robin Vaysse, Adil Zouitine, Heitor Murilo Gomes, Jesse Read, Talel Abdessalem, Albert Bifet

StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs

Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…

Hardware Architecture · Computer Science 2025-09-24 Hanchen Ye, Deming Chen

Managing caching strategies for stream reasoning with reinforcement learning

Efficient decision-making over continuously changing data is essential for many application domains such as cyber-physical systems, industry digitalization, etc. Modern stream reasoning frameworks allow one to model and solve various…

Artificial Intelligence · Computer Science 2020-08-10 Carmine Dodaro, Thomas Eiter, Paul Ogris, Konstantin Schekotihin

Context-aware Failure-oblivious Computing as a Means of Preventing Buffer Overflows

In languages like C, buffer overflows are widespread. A common mitigation technique is to use tools that detect them during execution and abort the program to prevent the leakage of data or the diversion of control flow. However, for server…

Cryptography and Security · Computer Science 2018-11-26 Manuel Rigger, Daniel Pekarek, Hanspeter Mössenböck

Class-Incremental Experience Replay for Continual Learning under Concept Drift

Modern machine learning systems need to be able to cope with constantly arriving and changing data. Two main areas of research dealing with such scenarios are continual learning and data stream mining. Continual learning focuses on…

Machine Learning · Computer Science 2021-04-27 Łukasz Korycki, Bartosz Krawczyk

Parallel Streaming Random Sampling

This paper investigates parallel random sampling from a potentially-unending data stream whose elements are revealed in a series of element sequences (minibatches). While sampling from a stream was extensively studied sequentially, not much…

Data Structures and Algorithms · Computer Science 2019-06-11 Kanat Tangwongsan, Srikanta Tirthapura

Spinning Fast Iterative Data Flows

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science 2012-08-02 Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs

In GPU-accelerated data analytics, the overhead of data transfer from CPU to GPU becomes a performance bottleneck when the data scales beyond GPU memory capacity due to the limited PCIe bandwidth. Data compression has come to rescue for…

Databases · Computer Science 2026-02-10 Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu

MOStream: A Modular and Self-Optimizing Data Stream Clustering Algorithm

Data stream clustering is a critical operation in various real-world applications, ranging from the Internet of Things (IoT) to social media and financial systems. Existing data stream clustering algorithms, while effective to varying…

Databases · Computer Science 2024-06-18 Zhengru Wang, Xin Wang, Shuhao Zhang

A Clustering-based Framework for Classifying Data Streams

The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches…

Machine Learning · Computer Science 2021-06-23 Xuyang Yan, Abdollah Homaifar, Mrinmoy Sarkar, Abenezer Girma, Edward Tunstel

Overview of streaming-data algorithms

Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. Also, these data are potentially unbounded. Boundless streams of data collected from sensors, equipment, and other…

Databases · Computer Science 2012-03-12 T Soni Madhulatha

Distributed Data Stream Processing and Edge Computing: A Survey on Resource Elasticity and Future Directions

Under several emerging application scenarios, such as in smart cities, operational monitoring of large infrastructure, wearable assistance, and Internet of Things, continuous data streams must be processed under very short delays. Several…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-12-05 Marcos Dias de Assuncao, Alexandre da Silva Veith, Rajkumar Buyya

Pipeflow: An Efficient Task-Parallel Pipeline Programming Framework using Modern C++

Pipeline is a fundamental parallel programming pattern. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. This design is convenient for data-centric pipeline applications but inefficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-03 Cheng-Hsiang Chiu, Tsung-Wei Huang, Zizheng Guo, Yibo Lin

StreamBlocks: A compiler for heterogeneous dataflow computing (technical report)

To increase performance and efficiency, systems use FPGAs as reconfigurable accelerators. A key challenge in designing these systems is partitioning computation between processors and an FPGA. An appropriate division of labor may be…

Hardware Architecture · Computer Science 2021-07-21 Endri Bezati, Mahyar Emami, Jörn Janneck, James Larus

Hardware-Conscious Stream Processing: A Survey

Data stream processing systems (DSPSs) enable users to express and run stream applications to continuously process data streams. To achieve real-time data analytics, recent research keeps focusing on optimizing the system latency and…

Databases · Computer Science 2024-06-18 Shuhao Zhang, Feng Zhang, Yingjun Wu, Bingsheng He, Paul Johns