Related papers: POTUS: Predictive Online Tuple Scheduling for Data…
Stream processing is usually done either on a tuple-by-tuple basis or in micro-batches. There are many applications where tuples over a predefined duration/window must be processed within certain deadlines. Processing such queries using…
Many applications process a stream of tuples over a window duration, and require the results within a specified deadline after the end of the window. For such scenarios, processing tuples intermittently (in batches) instead of eagerly…
Operating a distributed data stream processing workload efficiently at scale is hard. The operator of the workload must parallelize and lay out tasks of the workload with resources that match the requirement of target data rate. The…
Distributed Stream Processing frameworks are being commonly used with the evolution of Internet of Things(IoT). These frameworks are designed to adapt to the dynamic input message rate by scaling in/out.Apache Storm, originally developed by…
For wireless caching networks, the scheme design for content delivery is non-trivial in the face of the following tradeoff. On one hand, to optimize overall throughput, users can associate their nearby APs with great channel capacities;…
Data processing frameworks such as Apache Beam and Apache Spark are used for a wide range of applications, from logs analysis to data preparation for DNN training. It is thus unsurprising that there has been a large amount of work on…
Efficient matching of incoming events of data streams to persistent queries is fundamental to event stream processing systems. These applications require dealing with high volume and continuous data streams with fast processing time on…
Motivated by the increasing importance of providing delay-guaranteed services in general computing and communication systems, and the recent wide adoption of learning and prediction in network control, in this work, we consider a general…
Current main memory database system architectures are still challenged by high contention workloads and this challenge will continue to grow as the number of cores in processors continues to increase. These systems schedule transactions…
Due to the distribution of linked data across the web, the methods that process federated queries through a distributed approach are more attractive to the users and have gained more prosperity. In distributed processing of federated…
CPU scheduling is the reason behind the performance of multiprocessing and in time-shared operating systems. Different scheduling criteria are used to evaluate Central Processing Unit Scheduling algorithms which are based on different…
Time-evolving stream datasets exist ubiquitously in many real-world applications where their inherent hot keys often evolve over times. Nevertheless, few existing solutions can provide efficient load balance on these time-evolving datasets…
To deliver high performance in power limited systems, architects have turned to using heterogeneous systems, either CPU+GPU or mixed CPU-hardware systems. However, in systems with different processor types and task affinities, scheduling…
We present a number of novel algorithms, based on mathematical optimization formulations, in order to solve a homogeneous multiprocessor scheduling problem, while minimizing the total energy consumption. In particular, for a system with a…
Modern commodity computing systems are composed by a number of different heterogeneous processing units, each of which has its own unique performance and energy characteristics. However, the majority of current network packet processing…
Whilst computational resources at the cloud edge can be leveraged to improve latency and reduce the costs of cloud services for a wide variety mobile, web, and IoT applications; such resources are naturally constrained. For distributed…
The proliferation of big data and analytic workloads has driven the need for cloud compute and cluster-based job processing. With Apache Spark, users can process terabytes of data at ease with hundreds of parallel executors. At Microsoft,…
Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. Its on-demand nature makes it especially appealing for…
The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream…
We consider the flow network model to solve the multiprocessor real-time task scheduling problems. Using the flow network model or its generic form, linear programming (LP) formulation, for the problems is not new. However, the previous…