Related papers: Lightweight Asynchronous Snapshots for Distributed…

Asynchronous Complex Analytics in a Distributed Dataflow Architecture

Scalable distributed dataflow systems have recently experienced widespread adoption, with commodity dataflow engines such as Hadoop and Spark, and even commodity SQL engines routinely supporting increasingly sophisticated analytics tasks…

Databases · Computer Science 2015-10-27 Joseph E. Gonzalez, Peter Bailis, Michael I. Jordan, Michael J. Franklin, Joseph M. Hellerstein, Ali Ghodsi, Ion Stoica

Asynchronous Checkpoint for Eventually Consistent Databases

We focus on the problem of checkpointing (or taking a snapshot) in fully replicated eventually consistent distributed databases. In particular, we consider the problem of taking Distributed Transaction-Consistent Snapshots (DTCS). A typical…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-19 Raaghav Ravishankar, Sandeep Kulkarni, Nitin H Vaidya

Wireless Networks with Cache-Enabled and Backhaul-Limited Aerial Base Stations

Use of aerial base stations (ABSs) is a promising approach to enhance the agility and flexibility of future wireless networks. ABSs can improve the coverage and/or capacity of a network by moving supply towards demand. Deploying ABSs in a…

Networking and Internet Architecture · Computer Science 2020-09-09 Elham Kalantari, Halim Yanikomeroglu, Abbas Yongacoglu

AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing

Distributed Stream Processing Systems (DSPSs) are among the currently most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-06 Vinu E. Venugopal, Martin Theobald, Samira Chaychi, Amal Tawakuli

ASAP: Asynchronous Approximate Data-Parallel Computation

Emerging workloads, such as graph processing and machine learning, are approximate because of the scale of data involved and the stochastic nature of the underlying algorithms. These algorithms are often distributed over multiple machines…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-12-28 Asim Kadav, Erik Kruus

Benchmarking Distributed Stream Data Processing Systems

The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics. While first initiatives try to…

Databases · Computer Science 2019-06-27 Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl

A Utilization Model for Optimization of Checkpoint Intervals in Distributed Stream Processing Systems

State-of-the-art distributed stream processing systems such as Apache Flink and Storm have recently included checkpointing to provide fault-tolerance for stateful applications. This is a necessary eventuality as these systems head into the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-21 Sachini Jayasekara, Aaron Harwood, Shanika Karunasekera

ABS: Adaptive Bounded Staleness Converges Faster and Communicates Less

Wall-clock convergence time and communication rounds are critical performance metrics in distributed learning with parameter-server setting. While synchronous methods converge fast but are not robust to stragglers; and asynchronous ones can…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-22 Qiao Tan, Feng Zhu, Jingjing Zhang

Asynchronous Snapshots of Actor Systems for Latency-Sensitive Applications

The actor model is popular for many types of server applications. Efficient snapshotting of applications is crucial in the deployment of pre-initialized applications or moving running applications to different machines, e.g., for debugging…

Programming Languages · Computer Science 2019-09-19 Dominik Aumayr, Stefan Marr, Elisa Gonzalez Boix, Hanspeter Mössenböck

System-aware dynamic partitioning for batch and streaming workloads

When processing data streams with highly skewed and nonstationary key distributions, we often observe overloaded partitions when the hash partitioning fails to balance data correctly. To avoid slow tasks that delay the completion of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-01 Zoltán Zvara, Péter G. N. Szabó, Balázs Barnabás Lóránt, András A. Benczúr

Towards Fine-Grained Scalability for Stateful Stream Processing Systems

Dynamic scaling is critical to stream processing engines, as their long-running nature demands adaptive resource management. Existing scaling approaches easily cause performance degradation due to coarse-grained synchronization and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-17 Yunfan Qing, Wenli Zheng

Efficient Time-Evolving Stream Processing at Scale

Time-evolving stream datasets exist ubiquitously in many real-world applications where their inherent hot keys often evolve over time. Nevertheless, few existing solutions can provide efficient load balance on these time-evolving datasets…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Yu Huang

Amortized Posterior Sampling with Diffusion Prior Distillation

We propose Amortized Posterior Sampling (APS), a novel variational inference approach for efficient posterior sampling in inverse problems. Our method trains a conditional flow model to minimize the divergence between the variational…

Computer Vision and Pattern Recognition · Computer Science 2025-07-14 Abbas Mammadov, Hyungjin Chung, Jong Chul Ye

Downstream: efficient cross-platform algorithms for fixed-capacity stream downsampling

Due to ongoing accrual over long durations, a defining characteristic of real-world data streams is the requirement for rolling, often real-time, mechanisms to coarsen or summarize stream history. One common data structure for this purpose…

Data Structures and Algorithms · Computer Science 2025-06-17 Connor Yang, Joey Wagner, Emily Dolson, Luis Zaman, Matthew Andres Moreno

A cooperative partial snapshot algorithm for checkpoint-rollback recovery of large-scale and dynamic distributed systems and experimental evaluations

A distributed system consisting of a huge number of computational entities is prone to faults, because faults in a few nodes cause the entire system to fail. Consequently, fault tolerance of distributed systems is a critical issue.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-30 Junya Nakamura, Yonghwan Kim, Yoshiaki Katayama, Toshimitsu Masuzawa

DynaHash: Efficient Data Rebalancing in Apache AsterixDB (Extended Version)

Parallel shared-nothing data management systems have been widely used to exploit a cluster of machines for efficient and scalable data processing. When a cluster needs to be dynamically scaled in or out, data must be efficiently rebalanced.…

Databases · Computer Science 2021-05-25 Chen Luo, Michael J. Carey

FaaS Execution Models for Edge Applications

In this paper, we address the problem of supporting stateful workflows following a Function-as-a-Service (FaaS) model in edge networks. In particular we focus on the problem of data transfer, which can be a performance bottleneck due to the…

Networking and Internet Architecture · Computer Science 2022-09-05 Claudio Cicconetti, Marco Conti, Andrea Passarella

Collaborative Reuse of Streaming Dataflows in IoT Applications

Distributed Stream Processing Systems (DSPS) like Apache Storm and Spark Streaming enable composition of continuous dataflows that execute persistently over data streams. They are used by Internet of Things (IoT) applications to analyze…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-10 Shilpa Chaturvedi, Sahil Tyagi, Yogesh Simmhan

AAFLOW: Scalable Patterns for Agentic AI Workflows

Agentic workflows in large language model systems integrate retrieval, reasoning, and memory, but existing frameworks suffer from scalability and reproducibility limitations due to fragmented data orchestration, serialization overhead, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-05 Arup Kumar Sarker, Mills Staylor, Aymen Alsaadi, Gregor von Laszewski, Shantenu Jha, Geoffrey Fox

AutoFlow: Hotspot-Aware, Dynamic Load Balancing for Distributed Stream Processing

Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows are often dynamically changing, which may cause…

Systems and Control · Electrical Eng. & Systems 2021-03-17 Pengqi Lu, Liang Yuan, Yunquan Zhang, Hang Cao, Kun Li