Related papers: Stream parallel skeleton optimization
Parallel computing is very important to accelerate the performance of software systems. Additionally, considering that a recurring challenge is to process high data volumes continuously, stream processing emerged as a paradigm and software…
Stream computation is one of the approaches suitable for FPGA-based custom computing due to its high throughput capability brought by pipelining with regular memory access. To increase performance of iterative stream computation, we can…
Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…
Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are…
Stream computing is the use of multiple autonomic and parallel modules together with integrative processors at a higher level of abstraction to embody "intelligent" processing. The biological basis of this computing is sketched and the…
Operating a distributed data stream processing workload efficiently at scale is hard. The operator of the workload must parallelize and lay out tasks of the workload with resources that match the requirement of target data rate. The…
Streams are infinite sequences over a given data type. A stream specification is a set of equations intended to define a stream. A core property is productivity: unfolding the equations produces the intended stream in the limit. In this…
We construct parametrized isometric tensor network states -- referred to as skeletons -- that allow us to explore phases of abelian topological order and can be efficiently implemented on quantum processors. We obtain stable finite…
We define a broad class of deterministic stream functions and show they can be implemented as homomorphisms into a "state" monoid. The homomorphism laws are simpler than the conditions of previous semantic frameworks for stream program…
The structural design process for buildings is time-consuming and laborious. To automate this process, structural engineers combine optimization methods with simulation tools to find an optimal design with minimal building mass subject to…
Parallel batched data structures are designed to process synchronized batches of operations in a parallel computing model. In this paper, we propose parallel combining, a technique that implements a concurrent data structure from a parallel…
Pipeline parallelism is one of the key components for large-scale distributed training, yet its efficiency suffers from pipeline bubbles which were deemed inevitable. In this work, we introduce a scheduling strategy that, to our knowledge,…
Stream Processing (SP) has evolved as the leading paradigm to process and gain value from the high volume of streaming data produced e.g. in the domain of the Internet of Things. An SP system is a middleware that deploys a network of…
Streaming computations on massive data sets are an attractive candidate for parallelization, particularly when they exhibit independence (and hence data parallelism) between items in the stream. However, some streaming computations are…
The construction of Mapper has emerged in the last decade as a powerful and effective topological data analysis tool that approximates and generalizes other topological summaries, such as the Reeb graph, the contour tree, split, and joint…
In this report, we present a new programming model based on Pipelines and Operators, which are the building blocks of programs written in PiCo, a DSL for Data Analytics Pipelines. In the model we propose, we use the term Pipeline to denote…
Pipeline is a fundamental parallel programming pattern. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. This design is convenient for data-centric pipeline applications but inefficient…
We study the problem of minimizing total completion time on parallel machines subject to varying processing capacity. In this paper, we develop an approximation scheme for the problem under the data stream model where the input data is…
The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream…
To train modern large DNN models, pipeline parallelism has recently emerged, which distributes the model across GPUs and enables different devices to process different microbatches in pipeline. Earlier pipeline designs allow multiple…