Related papers: ShuffleBench: A Benchmark for Large-Scale Data Shu…

200 papers

Recent advancements in data stream processing frameworks have improved real-time data handling; however, scalability remains a significant challenge affecting throughput and latency. While studies have explored this issue on local machines…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-04-04 · Apurv Deepak Kulkarni, Siavash Ghiasvand

The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-04-16 · Pratyush Agnihotri, Boris Koldehofe, Roman Heinrich, Carsten Binnig, Manisha Luthra

Nowadays, several software systems rely on stream processing architectures to deliver scalable performance and handle large volumes of data in near real-time. Stream processing frameworks facilitate scalable computing by distributing the…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-05-30 · Adriano Vogel, Sören Henning, Esteban Perez-Wohlfeil, Otmar Ertl, Rick Rabiser

Context: The combination of distributed stream processing with microservice architectures is an emerging pattern for building data-intensive software systems. In such systems, stream processing frameworks such as Apache Flink, Apache Kafka…

Software Engineering · Computer Science · 2023-11-02 · Sören Henning, Wilhelm Hasselbring

The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics. While first initiatives try to…

Databases · Computer Science · 2019-06-27 · Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl

Growing data volumes and velocities in fields such as Industry 4.0 or the Internet of Things have led to the increased popularity of data stream processing systems. Enterprises can leverage these developments by enriching their core…

Performance · Computer Science · 2021-03-12 · Guenter Hesse, Christoph Matthies, Michael Perscheid, Matthias Uflacker, Hasso Plattner

Distributed stream processing engines are designed with a focus on scalability to process big data volumes in a continuous manner. We present the Theodolite method for benchmarking the scalability of distributed stream processing engines.…

Software Engineering · Computer Science · 2021-02-12 · Sören Henning, Wilhelm Hasselbring

Distributed data processing frameworks (e.g., Hadoop, Spark, and Flink) are widely used to distribute data among computing nodes of a cloud. Recently, there have been increasing efforts aimed at evaluating the performance of distributed…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-01-07 · Faheem Ullah, Shagun Dhingra, Xiaoyu Xia, M. Ali Babar

Parallel computing is very important to accelerate the performance of software systems. Additionally, considering that a recurring challenge is to process high data volumes continuously, stream processing emerged as a paradigm and software…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-05-14 · Adriano Vogel, Sören Henning, Esteban Perez-Wohlfeil, Otmar Ertl, Rick Rabiser

Performance benchmarking is a common practice in software engineering, particularly when building large-scale, distributed, and data-intensive systems. While cloud environments offer several advantages for running benchmarks, it is often…

Software Engineering · Computer Science · 2025-04-17 · Sören Henning, Adriano Vogel, Esteban Perez-Wohlfeil, Otmar Ertl, Rick Rabiser

The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-02-10 · Tainã Coleman, Henri Casanova, Ketan Maheshwari, Loïc Pottier, Sean R. Wilkinson, Justin Wozniak, Frédéric Suter, Mallikarjun Shankar, Rafael Ferreira da Silva

In the fields of big data, AI, and streaming processing, we work with large amounts of data from multiple sources. Due to memory and network limitations, we process data streams on distributed systems to alleviate computational and network…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-06-18 · József Dániel Gáspár, Martin Horváth, Győző Horváth, Zoltán Zvara

Stream processing is a compute paradigm that promises safe and efficient parallelism. Modern big-data problems are often well suited for stream processing's throughput-oriented nature. Realization of efficient stream processing requires…

Performance · Computer Science · 2015-04-14 · Jonathan C. Beard, Roger D. Chamberlain

Apache Flink is an open-source system for scalable processing of batch and streaming data. Flink does not natively support efficient processing of spatial data streams, which is a requirement of many applications dealing with spatial data.…

Databases · Computer Science · 2020-08-04 · Salman Ahmed Shaikh, Komal Mariam, Hiroyuki Kitagawa, Kyoung-Sook Kim

Making serverless computing widely applicable requires detailed performance understanding. Although contemporary benchmarking approaches exist, they report only coarse results, do not apply distributed tracing, do not consider asynchronous…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-05-17 · Joel Scheuner, Simon Eismann, Sacheendra Talluri, Erwin van Eyk, Cristina Abad, Philipp Leitner, Alexandru Iosup

Shared memory multiprocessors are regaining popularity thanks to the rapid spread of commodity multi-core architectures. As ever, shared memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…

Distributed, Parallel, and Cluster Computing · Computer Science · 2009-09-10 · Marco Aldinucci, Massimo Torquati, Massimiliano Meneghin

Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level,…

Distributed, Parallel, and Cluster Computing · Computer Science · 2018-03-30 · Bilal Akil, Ying Zhou, Uwe Röhm

Distributed stream processing frameworks are commonly used with the evolution of the Internet of Things (IoT). These frameworks are designed to adapt to the dynamic input message rate by scaling in/out. Apache Storm, originally developed by…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-05-10 · Anshu Shukla, Yogesh Simmhan

Distributed inference serves as a promising approach to enabling the inference of large language models (LLMs) at the network edge. It distributes the inference process to multiple devices to ensure that the LLMs can fit into the device…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-01-13 · Xing Liu, Lizhuo Luo, Ming Tang, Chao Huang, Xu Chen

Today, we have to deal with large amounts of data (big data), and we need to choose an architectural framework to analyze data coming from different areas. Because of this, it becomes problematic when we want to process these data,…

Software Engineering · Computer Science · 2019-01-29 · Youness Dendane, Fabio Petrillo, Hamid Mcheick, Souhail Ben Ali