Related papers: Process Faster, Pay Less: Functional Isolation for…
Current systems for data-parallel, incremental processing and view maintenance over high-rate streams isolate the execution of independent queries. This creates unwanted redundancy and overhead in the presence of concurrent incrementally…
Serverless computing and stream processing represent two dominant paradigms for event-driven data processing, yet both make assumptions that render them inefficient for short-running, lightweight, and unpredictable streams that require…
Despite many advances in query optimization, indexing techniques, and data storage, modern data platforms still face difficulties in delivering robust query performance under high concurrency and computationally intensive queries. This…
The Function-as-a-Service (FaaS) execution model increases developer productivity by removing operational concerns such as managing hardware or software runtimes. Developers, however, still need to partition their applications into FaaS…
The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high…
There is increasing interest in using multicore processors to accelerate stream processing. For example, indexing sliding window content to enhance the performance of streaming queries is greatly improved by utilizing the computational…
Resource provisioning in multi-tenant stream processing systems faces the dual challenges of keeping resource utilization high (without over-provisioning), and ensuring performance isolation. In our common production use cases, where…
Modern datacenter switches share packet buffers across ports to boost overall throughput and reduce packet loss. However, as buffer availability per-port-per-bandwidth unit continues to decrease, existing buffer-sharing strategies face…
Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…
Despite all the available commercial and open-source frameworks to ease deploying FPGAs in accelerating applications, the current schemes fail to support sharing multiple accelerators among various applications. There are three main…
Stream processing is a compute paradigm that promises safe and efficient parallelism. Modern big-data problems are often well suited for stream processing's throughput-oriented nature. Realization of efficient stream processing requires…
Hospitals around the world collect massive amounts of physiological data from their patients every day. Recently, there has been an increase in research interest to subject this data to statistical analysis to gain more insights and provide…
Stream processing is usually done either on a tuple-by-tuple basis or in micro-batches. There are many applications where tuples over a predefined duration/window must be processed within certain deadlines. Processing such queries using…
Serverless computing is an excellent fit for big data processing because it can scale quickly and cheaply to thousands of parallel functions. Existing serverless platforms isolate functions in ephemeral, stateless containers, preventing…
The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream…
Stream processing has become a critical component in the architecture of modern applications. With the exponential growth of data generation from sources such as the Internet of Things, business intelligence, and telecommunications,…
Serverless computing provides just-in-time infrastructure provisioning with rapid elasticity and a finely-grained pricing model. As full control of resource allocation is in the hands of the cloud provider and applications only consume…
Time-evolving stream datasets exist ubiquitously in many real-world applications where their inherent hot keys often evolve over times. Nevertheless, few existing solutions can provide efficient load balance on these time-evolving datasets…
In the burgeoning realm of Internet of Things (IoT) applications on edge devices, data stream compression has become increasingly pertinent. The integration of added compression overhead and limited hardware resources on these devices calls…
Often, machine learning applications have to cope with dynamic environments where data are collected in the form of continuous data streams with potentially infinite length and transient behavior. Compared to traditional (batch) data…