Related papers: Starling: A Scalable Query Engine on Cloud Functio…

Saving Money for Analytical Workloads in the Cloud

As users migrate their analytical workloads to cloud databases, it is becoming just as important to reduce monetary costs as it is to optimize query runtime. In the cloud, a query is billed based on either its compute time or the amount of…

Databases · Computer Science 2024-08-02 Tapan Srivastava, Raul Castro Fernandez

Flock: A Low-Cost Streaming Query Engine on FaaS Platforms

Existing serverless data analytics systems rely on external storage services like S3 for data shuffling and communication between cloud functions. While this approach provides the elasticity benefits of serverless computing, it incurs…

Databases · Computer Science 2024-04-23 Gang Liao, Amol Deshpande, Daniel J. Abadi

An Empirical Evaluation of Serverless Cloud Infrastructure for Large-Scale Data Processing

Data processing systems are increasingly deployed in the cloud. While monolithic systems run fully on virtual servers, recent systems embrace cloud infrastructure and utilize the disaggregation of compute and storage to scale them…

Databases · Computer Science 2025-01-15 Thomas Bodner, Theo Radig, David Justen, Daniel Ritter, Tilmann Rabl

Skyrise: Exploiting Serverless Cloud Infrastructure for Elastic Data Processing

Serverless computing offers elasticity unmatched by conventional server-based cloud infrastructure. Although modern data processing systems embrace serverless storage, such as Amazon S3, they continue to manage their compute resources as…

Databases · Computer Science 2025-01-29 Thomas Bodner, Daniel Ritter, Martin Boissier, Tilmann Rabl

Data-driven scheduling in serverless computing to reduce response time

In Function as a Service (FaaS), a serverless computing variant, customers deploy functions instead of complete virtual machines or Linux containers. It is the cloud provider who maintains the runtime environment for these functions. FaaS…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-16 Bartłomiej Przybylski, Paweł Żuk, Krzysztof Rzadca

Formal Foundations of Serverless Computing

Serverless computing (also known as functions as a service) is a new cloud computing abstraction that makes it easier to write robust, large-scale web services. In serverless computing, programmers write what are called serverless…

Programming Languages · Computer Science 2021-03-12 Abhinav Jangda, Donald Pinckney, Yuriy Brun, Arjun Guha

Flora: Efficient Cloud Resource Selection for Big Data Processing via Job Classification

Distributed dataflow systems like Spark and Flink enable data-parallel processing of large datasets on clusters of cloud resources. Yet, selecting appropriate computational resources for dataflow jobs is often challenging. For efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-03 Jonathan Will, Lauritz Thamsen, Jonathan Bader, Odej Kao

CWD: A Machine Learning based Approach to Detect Unknown Cloud Workloads

Workloads in modern cloud data centers are becoming increasingly complex. The number of workloads running in cloud data centers has been growing exponentially for the last few years, and cloud service providers (CSP) have been supporting…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-30 Mohammad Hossain, Derssie Mebratu, Niranjan Hasabnis, Jun Jin, Gaurav Chaudhary, Noah Shen

Scheduling Methods to Reduce Response Latency of Function as a Service

Function as a Service (FaaS) permits cloud customers to deploy to cloud individual functions, in contrast to complete virtual machines or Linux containers. All major cloud providers offer FaaS products (Amazon Lambda, Google Cloud…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-12 Pawel Zuk, Krzysztof Rzadca

Cluster Resource Management for Dynamic Workloads by Online Optimization

Over the past ten years, many different approaches have been proposed for different aspects of the problem of resources management for long running, dynamic and diverse workloads such as processing query streams or distributed deep…

Performance · Computer Science 2023-08-24 Nader Alfares, George Kesidis, Ata Fatahi Baarzi, Aman Jain

A Language-based Serverless Function Accelerator

Serverless computing is an approach to cloud computing that allows programmers to run serverless functions in response to external events. Serverless functions are priced at sub-second granularity, support transparent elasticity, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-05 Emily Herbert, Arjun Guha

Intelligent Load Balancing in Cloud Computer Systems

Cloud computing is an established technology allowing users to share resources on a large scale, never before seen in IT history. A cloud system connects multiple individual servers in order to process related tasks in several environments…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-30 Leszek Sliwko

Skedulix: Hybrid Cloud Scheduling for Cost-Efficient Execution of Serverless Applications

We present a framework for scheduling multifunction serverless applications over a hybrid public-private cloud. A set of serverless jobs is input as a batch, and the objective is to schedule function executions over the hybrid platform to…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-09 Anirban Das, Andrew Leaf, Carlos A. Varela, Stacy Patterson

Building Scalable AI-Powered Applications with Cloud Databases: Architectures, Best Practices and Performance Considerations

The rapid adoption of AI-powered applications demands high-performance, scalable, and efficient cloud database solutions, as traditional architectures often struggle with AI-driven workloads requiring real-time data access, vector search,…

Databases · Computer Science 2025-05-06 Santosh Bhupathi

Serverless Query Processing with Flexible Performance SLAs and Prices

Serverless query processing has become increasingly popular due to its auto-scaling, high elasticity, and pay-as-you-go pricing. It allows cloud data warehouse (or lakehouse) users to focus on data analysis without the burden of managing…

Databases · Computer Science 2024-12-24 Haoqiong Bian, Dongyang Geng, Yunpeng Chai, Anastasia Ailamaki

Scientific Workflow Applications on Amazon EC2

The proliferation of commercial cloud computing providers has generated significant interest in the scientific computing community. Much recent research has attempted to determine the benefits and drawbacks of cloud computing for scientific…

Instrumentation and Methods for Astrophysics · Physics 2016-11-15 Gideon Juve, Ewa Deelman, Karan Vahi, Gaurang Mehta, Bruce Berriman, Benjamin P. Berman, Phil Maechling

Big Data Analytics-Enhanced Cloud Computing: Challenges, Architectural Elements, and Future Directions

The emergence of cloud computing has made dynamic provisioning of elastic capacity to applications on-demand. Cloud data centers contain thousands of physical servers hosting orders of magnitude more virtual machines that can be allocated…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-17 Rajkumar Buyya, Kotagiri Ramamohanarao, Chris Leckie, Rodrigo N. Calheiros, Amir Vahid Dastjerdi, Steve Versteeg

Journey of Migrating Millions of Queries on The Cloud

Treasure Data is processing millions of distributed SQL queries every day on the cloud. Upgrading the query engine service at this scale is challenging because we need to migrate all of the production queries of the customers to a new…

Databases · Computer Science 2022-05-19 Taro L. Saito, Naoki Takezoe, Yukihiro Okada, Takako Shimamoto, Dongmin Yu, Suprith Chandrashekharachar, Kai Sasaki, Shohei Okumiya, Yan Wang, Takashi Kurihara, Ryu Kobayashi, Keisuke Suzuki, Zhenghong Yang, Makoto Onizuka

Optimizing Data Lakes' Queries

Cloud data lakes provide a modern solution for managing large volumes of data. The fundamental principle behind these systems is the separation of compute and storage layers. In this architecture, inexpensive cloud storage is utilized for…

Databases · Computer Science 2025-10-20 Gregory Weintraub

Caching Stars in the Sky: A Semantic Caching Approach to Accelerate Skyline Queries

Multi-criteria decision making has been made possible with the advent of skyline queries. However, processing such queries for high dimensional datasets remains a time consuming task. Real-time applications are thus infeasible, especially…

Databases · Computer Science 2011-06-13 Arnab Bhattacharya, B. Palvali Teja, Sourav Dutta