Related papers: BriskStream: Scaling Data Stream Processing on Sha…
Recent data stream processing systems (DSPSs) can achieve excellent performance when processing large volumes of data under tight latency constraints. However, they sacrifice support for concurrent state access that eases the burden of…
Data stream processing systems (DSPSs) enable users to express and run stream applications to continuously process data streams. To achieve real-time data analytics, recent researches keep focusing on optimizing the system latency and…
Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…
Transactional Stream Processing Engines (TSPEs) form the backbone of modern stream applications handling shared mutable states. Yet, the full potential of these systems, specifically in exploiting parallelism and implementing dynamic…
Memory disaggregation has emerged as an alternative to traditional server architecture in data centers. This paper introduces DRackSim, a simulation infrastructure to model rack-scale hardware disaggregated memory. DRackSim models multiple…
This paper presents a novel high speed clustering scheme for high dimensional data streams. Data stream clustering has gained importance in different applications, for example, in network monitoring, intrusion detection, and real-time…
Disaggregation and rack-scale systems have the potential of drastically decreasing TCO and increasing utilization of cloud datacenters, while maintaining performance. While the concept of organising resources in separate pools and…
In this paper, we investigate three cross-facility data streaming architectures, Direct Streaming (DTS), Proxied Streaming (PRS), and Managed Service Streaming (MSS). We examine their architectural variations in data flow paths and…
BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging. We present an extension to the Architecture and…
High-level synthesis (HLS) has enabled the rapid development of custom hardware circuits for many software applications. However, developing high-performance hardware circuits using HLS is still a non-trivial task requiring expertise in…
Stream reasoning systems are designed for complex decision-making from possibly infinite, dynamic streams of data. Modern approaches to stream reasoning are usually performing their computations using stand-alone solvers, which…
Distributed Stream Processing Systems (DSPSs) are among the currently most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The…
Multi-socket multi-core servers are used for solving some of the important problems in computing. Remote DRAM accesses can impact performance of certain applications running on such servers. This paper presents a new near linear operating…
In the burgeoning realm of Internet of Things (IoT) applications on edge devices, data stream compression has become increasingly pertinent. The integration of added compression overhead and limited hardware resources on these devices calls…
Recent trends see a move away from a fixed-resource server-centric datacenter model to a more adaptable "disaggregated" datacenter model. These disaggregated datacenters can then dynamically group resources to the specific requirements of…
Ever-increasing amounts of data and requirements to process them in real time lead to more and more analytics platforms and software systems being designed according to the concept of stream processing. A common area of application is the…
As computer clusters become more common and the size of the problems encountered in the field of AI grows, there is an increasing demand for efficient parallel inference algorithms. We consider the problem of parallel inference on large…
As the landscape of deep neural networks evolves, heterogeneous dataflow accelerators, in the form of multi-core architectures or chiplet-based designs, promise more flexibility and higher inference performance through scalability. So far,…
The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream…
Our ISCA 2015 paper provides a new programmable processing-in-memory (PIM) architecture and system design that can accelerate key data-intensive applications, with a focus on graph processing workloads. Our major idea was to completely…