Related papers: CIAO: An Optimization Framework for Client-Assiste…
Many data analytics systems store and process large datasets in partitions containing millions of rows. By mapping rows to partitions in an optimized way, it is possible to improve query performance by skipping over large numbers of…
Processing data streams in near real-time is an increasingly important task. In the case of event-timestamped data, the stream processing system must promptly handle late events that arrive after the corresponding window has been processed.…
Task-based programming models are excellent tools to parallelize and seamlessly load balance an application workload. However, the integration of I/O intensive applications and task-based programming models is lacking. Typically, I/O…
A modern GPU aims to simultaneously execute more warps for higher Thread-Level Parallelism (TLP) and performance. When generating many memory requests, however, warps contend for limited cache space and thrash cache, which in turn severely…
Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale distributed training. We first identify performance and scalability…
Large-scale deep learning workloads increasingly suffer from I/O bottlenecks as datasets grow beyond local storage capacities and GPU compute outpaces network and disk latencies. While recent systems optimize data-loading time, they…
Resource provisioning in multi-tenant stream processing systems faces the dual challenges of keeping resource utilization high (without over-provisioning), and ensuring performance isolation. In our common production use cases, where…
Major players in e-commerce process dynamically incoming orders in real-time and already use advanced anticipation techniques, like AI, to predict characteristics of future orders. However, at the warehousing level, there are still no…
In modern server CPUs, last-level cache (LLC) is a critical hardware resource that exerts significant influence on the performance of the workloads, and how to manage LLC is a key to the performance isolation and QoS in the cloud with…
Query processing over big data is ubiquitous in modern clouds, where the system takes care of picking both the physical query execution plans and the resources needed to run those plans, using a cost-based query optimizer. A good cost…
With the ever-growing need of data in HPC applications, the congestion at the I/O level becomes critical in super-computers. Architectural enhancement such as burst-buffers and pre-fetching are added to machines, but are not sufficient to…
Network bound applications, like a database server executing OLTP queries or a caching server storing objects for a dynamic web applications, are essential services that consumers and businesses use daily. These services run on a large…
Current serverless platforms struggle to optimize resource utilization due to their dynamic and fine-grained nature. Conventional techniques like overcommitment and autoscaling fall short, often sacrificing utilization for practicability or…
Scheduling of service requests in Cloud computing has traditionally focused on the reduction of pre-service wait, generally termed as waiting time. Under certain conditions such as peak load, however, it is not always possible to give…
Serverless Function-as-a-Service (FaaS) is a popular cloud paradigm to quickly and cheaply implement complex applications. Because the function instances cloud providers start to execute user code run on shared infrastructure, their…
Query optimizers are crucial for the performance of database systems. Recently, many learned query optimizers (LQOs) have demonstrated significant performance improvements over traditional optimizers. However, most of them operate under a…
Recent increase in energy prices has led researchers to find better ways for capacity provisioning in data centers to reduce energy wastage due to the variation in workload. This paper explores the opportunity for cost saving utilizing the…
Serverless computing adopts a pay-as-you-go billing model where applications are executed in stateless and shortlived containers triggered by events, resulting in a reduction of monetary costs and resource utilization. However, existing…
Energy consumption represents a significant cost in data center operation. A large fraction of the energy, however, is used to power idle servers when the workload is low. Dynamic provisioning techniques aim at saving this portion of the…
Currently, massive video tasks are processed by edge-cloud collaboration. However, the diversity of task requirements and the dynamics of resources pose great challenges to efficient inference, resulting in many wasted resources. In this…