Related papers: Compile-Time Java Stream Fusion via mapMulti
Stream processing is mainstream (again): Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking.…
The Stream API was added in Java 8 to allow the declarative expression of data-processing logic, typically map-reduce-like data transformations on collections and datasets. The Stream API introduces two key abstractions. The stream, which…
The introduction of lambdas in Java 8 completes the slate of statically-typed, mainstream languages with both object-oriented and functional features. The main motivation for lambdas in Java has been to facilitate stream-based declarative…
The Java Stream API aims at increasing developer productivity thanks to an easy-to-read declarative syntax to express computations. It also simplifies parallel computing, providing a high-level abstraction on top of common parallelization…
We present a system of efficient methods for traversing and combining associative collection data structures. A distinguishing feature of the system is that, like traditional sequential iterator libraries, it does not require specialized…
Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation. The inherent ambiguity introduced by occlusions directly violates the brightness constancy constraint and considerably hinders…
To increase performance and efficiency, systems use FPGAs as reconfigurable accelerators. A key challenge in designing these systems is partitioning computation between processors and an FPGA. An appropriate division of labor may be…
We present the stream processing library that achieves the highest performance of existing OCaml streaming libraries, attaining the speed and memory efficiency of hand-written state machines. It supports finite and infinite streams with the…
Stream-based runtime monitors are safety assurance tools that check at runtime whether the system's behavior satisfies a formal specification. Specifications consist of stream equations, which relate input streams, containing sensor…
We address the joint optimization of multiple stream joins in a scale-out architecture by tailoring prior work on multi-way stream joins to predicate-driven data partitioning schemes. We present an integer linear programming (ILP)…
Processing large amounts of data fast, in constant and small space is the point of stream processing and the reason for its increasing use. Alas, the most performant, imperative processing code tends to be almost impossible to read, let…
Many modern virtual machines, such as JVMs, .NET Framework, and V8, employ a just-in-time (JIT) compiler to achieve their high-performance. There are two major compilation strategies; trace-based compilation and method-based compilation.…
There are billions of lines of sequential code inside nowadays' software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…
Pipeline is a fundamental parallel programming pattern. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. This design is convenient for data-centric pipeline applications but inefficient…
In GPU-accelerated data analytics, the overhead of data transfer from CPU to GPU becomes a performance bottleneck when the data scales beyond GPU memory capacity due to the limited PCIe bandwidth. Data compression has come to rescue for…
StreamSampling$.$jl is a Julia library designed to provide general and efficient methods for sampling from data streams in a single pass, even when the total number of items is unknown. In this paper, we describe the capabilities of the…
The explosion of Big Data was followed by the proliferation of numerous complex parallel software stacks whose aim is to tackle the challenges of data deluge. A drawback of a such multi-layered hierarchical deployment is the inability to…
Shared memory multiprocessors come back to popularity thanks to rapid spreading of commodity multi-core architectures. As ever, shared memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…
As the landscape of deep neural networks evolves, heterogeneous dataflow accelerators, in the form of multi-core architectures or chiplet-based designs, promise more flexibility and higher inference performance through scalability. So far,…