Related papers: CIAO: An Optimization Framework for Client-Assiste…

Dynamic Data Layout Optimization with Worst-case Guarantees

Many data analytics systems store and process large datasets in partitions containing millions of rows. By mapping rows to partitions in an optimized way, it is possible to improve query performance by skipping over large numbers of…

Databases · Computer Science 2024-05-09 Kexin Rong, Paul Liu, Sarah Ashok Sonje, Moses Charikar

Aion: Better Late than Never in Event-Time Streams

Processing data streams in near real-time is an increasingly important task. In the case of event-timestamped data, the stream processing system must promptly handle late events that arrive after the corresponding window has been processed.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-24 Sérgio Esteves, Gianmarco De Francisci Morales, Rodrigo Rodrigues, Marco Serafini, Luís Veiga

Introducing the Task-Aware Storage I/O (TASIO) Library

Task-based programming models are excellent tools to parallelize and seamlessly load balance an application workload. However, the integration of I/O intensive applications and task-based programming models is lacking. Typically, I/O…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-30 Aleix Roca Nonell, Vicenç Beltran Querol, Sergi Mateo Bellido

CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs

A modern GPU aims to simultaneously execute more warps for higher Thread-Level Parallelism (TLP) and performance. When generating many memory requests, however, warps contend for limited cache space and thrash cache, which in turn severely…

Hardware Architecture · Computer Science 2018-05-22 Jie Zhang, Shuwen Gao, Nam Sung Kim, Myoungsoo Jung

Accelerating Data Loading in Deep Neural Network Training

Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale distributed training. We first identify performance and scalability…

Machine Learning · Computer Science 2020-02-20 Chih-Chieh Yang, Guojing Cong

EMLIO: Minimizing I/O Latency and Energy Consumption for Large-Scale AI Training

Large-scale deep learning workloads increasingly suffer from I/O bottlenecks as datasets grow beyond local storage capacities and GPU compute outpaces network and disk latencies. While recent systems optimize data-loading time, they…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-18 Hasibul Jamil, MD S Q Zulkar Nine, Tevfik Kosar

Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo

Resource provisioning in multi-tenant stream processing systems faces the dual challenges of keeping resource utilization high (without over-provisioning), and ensuring performance isolation. In our common production use cases, where…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-08 Le Xu, Shivaram Venkataraman, Indranil Gupta, Luo Mai, Rahul Potharaju

On picking operations in e-commerce warehouses: Insights from the complete-information counterpart

Major players in e-commerce process dynamically incoming orders in real-time and already use advanced anticipation techniques, like AI, to predict characteristics of future orders. However, at the warehousing level, there are still no…

Optimization and Control · Mathematics 2024-10-21 Catherine Lorenz, Alena Otto, Michel Gendreau

IOCA: High-Speed I/O-Aware LLC Management for Network-Centric Multi-Tenant Platform

In modern server CPUs, last-level cache (LLC) is a critical hardware resource that exerts significant influence on the performance of the workloads, and how to manage LLC is a key to the performance isolation and QoS in the cloud with…

Hardware Architecture · Computer Science 2021-03-05 Yifan Yuan, Mohammad Alian, Yipeng Wang, Ilia Kurakin, Ren Wang, Charlie Tai, Nam Sung Kim

Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings

Query processing over big data is ubiquitous in modern clouds, where the system takes care of picking both the physical query execution plans and the resources needed to run those plans, using a cost-based query optimizer. A good cost…

Databases · Computer Science 2020-03-02 Tarique Siddiqui, Alekh Jindal, Shi Qiao, Hiren Patel, Wangchao Le

Periodic I/O scheduling for super-computers

With the ever-growing need for data in HPC applications, congestion at the I/O level becomes critical in super-computers. Architectural enhancements such as burst-buffers and pre-fetching are added to machines, but are not sufficient to…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-23 Guillaume Aupy, Ana Gainaru, Valentin Le Fèvre

CARGO : Context Augmented Critical Region Offload for Network-bound datacenter Workloads

Network-bound applications, like a database server executing OLTP queries or a caching server storing objects for dynamic web applications, are essential services that consumers and businesses use daily. These services run on a large…

Hardware Architecture · Computer Science 2020-08-18 Siddharth Rai, Trevor E. Carlson

Jiagu: Optimizing Serverless Computing Resource Utilization with Harmonized Efficiency and Practicability

Current serverless platforms struggle to optimize resource utilization due to their dynamic and fine-grained nature. Conventional techniques like overcommitment and autoscaling fall short, often sacrificing utilization for practicability or…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-04 Qingyuan Liu, Yanning Yang, Dong Du, Yubin Xia, Ping Zhang, Jia Feng, James Larus, Haibo Chen

Patience-aware Scheduling for Cloud Services: Freeing Users from the Chains of Boredom

Scheduling of service requests in Cloud computing has traditionally focused on the reduction of pre-service wait, generally termed as waiting time. Under certain conditions such as peak load, however, it is not always possible to give…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-08-21 Carlos Cardonha, Marcos D. Assunção, Marco A. S. Netto, Renato L. F. Cunha, Carlos Queiroz

Minos: Exploiting Cloud Performance Variation with Function-as-a-Service Instance Selection

Serverless Function-as-a-Service (FaaS) is a popular cloud paradigm to quickly and cheaply implement complex applications. Because the function instances cloud providers start to execute user code run on shared infrastructure, their…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-27 Trever Schirmer, Natalie Carl, Nils Höller, Tobias Pfandzelter, David Bermbach

LIMAO: A Framework for Lifelong Modular Learned Query Optimization

Query optimizers are crucial for the performance of database systems. Recently, many learned query optimizers (LQOs) have demonstrated significant performance improvements over traditional optimizers. However, most of them operate under a…

Databases · Computer Science 2025-07-02 Qihan Zhang, Shaolin Xie, Ibrahim Sabek

Dynamic Deferral of Workload for Capacity Provisioning in Data Centers

Recent increase in energy prices has led researchers to find better ways for capacity provisioning in data centers to reduce energy wastage due to the variation in workload. This paper explores the opportunity for cost saving utilizing the…

Networking and Internet Architecture · Computer Science 2015-03-19 Muhammad Abdullah Adnan, Ryo Sugihara, Yan Ma, Rajesh Gupta

Cost Optimization for Serverless Edge Computing with Budget Constraints using Deep Reinforcement Learning

Serverless computing adopts a pay-as-you-go billing model where applications are executed in stateless and short-lived containers triggered by events, resulting in a reduction of monetary costs and resource utilization. However, existing…

Networking and Internet Architecture · Computer Science 2025-01-27 Chen Chen, Peiyuan Guan, Ziru Chen, Amir Taherkordi, Fen Hou, Lin X. Cai

Simple and Effective Dynamic Provisioning for Power-Proportional Data Centers

Energy consumption represents a significant cost in data center operation. A large fraction of the energy, however, is used to power idle servers when the workload is low. Dynamic provisioning techniques aim at saving this portion of the…

Performance · Computer Science 2012-02-28 Tan Lu, Minghua Chen

CDIO: Cross-Domain Inference Optimization with Resource Preference Prediction for Edge-Cloud Collaboration

Currently, massive video tasks are processed by edge-cloud collaboration. However, the diversity of task requirements and the dynamics of resources pose great challenges to efficient inference, resulting in many wasted resources. In this…

Multimedia · Computer Science 2025-02-07 Zheming Yang, Wen Ji, Qi Guo, Dieli Hu, Chang Zhao, Xiaowei Li, Xuanlei Zhao, Yi Zhao, Chaoyu Gong, Yang You