Related papers: Serving deep learning models in a serverless platf…

A Survey of Serverless Machine Learning Model Inference

Recent developments in Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products. This widespread adoption of AI requires significant efforts in deploying these…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-23 Kamil Kojs

Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud

This review report discusses the cold start latency in serverless inference and existing solutions. It particularly reviews the ServerlessLLM method, a system designed to address the cold start problem in serverless inference for large…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-26 Himel Ghosh

Distributed Double Machine Learning with a Serverless Architecture

This paper explores serverless cloud computing for double machine learning. Being based on repeated cross-fitting, double machine learning is particularly well suited to exploit the high level of parallelism achievable with serverless…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-21 Malte S. Kurz

Serverless Computing: Behind the Scenes of Major Platforms

Serverless computing offers an event-driven, pay-as-you-go framework for application development. A key selling point is the concept of no back-end server management, allowing developers to focus on application functionality. This is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-11 Daniel Kelly, Frank G Glavin, Enda Barrett

An Empirical Evaluation of Serverless Cloud Infrastructure for Large-Scale Data Processing

Data processing systems are increasingly deployed in the cloud. While monolithic systems run fully on virtual servers, recent systems embrace cloud infrastructure and utilize the disaggregation of compute and storage to scale them…

Databases · Computer Science 2025-01-15 Thomas Bodner, Theo Radig, David Justen, Daniel Ritter, Tilmann Rabl

Towards Demystifying Serverless Machine Learning Training

The appeal of serverless (FaaS) has triggered a growing interest in how to use it in data-intensive applications such as ETL, query processing, or machine learning (ML). Several systems exist for training large-scale ML models on top of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-18 Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang

Serverless Data Science -- Are We There Yet? A Case Study of Model Serving

Machine learning (ML) is an important part of modern data science applications. Data scientists today have to manage the end-to-end ML life cycle that includes both model training and model serving, the latter of which is essential, as it…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-02 Yuncheng Wu, Tien Tuan Anh Dinh, Guoyu Hu, Meihui Zhang, Yeow Meng Chee, Beng Chin Ooi

Serverless inferencing on Kubernetes

Organisations are increasingly putting machine learning models into production at scale. The increasing popularity of serverless scale-to-zero paradigms presents an opportunity for deploying machine learning models to help mitigate…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-27 Clive Cox, Dan Sun, Ellis Tarn, Animesh Singh, Rakesh Kelkar, David Goodwin

Lambada: Interactive Data Analytics on Cold Data using Serverless Cloud Infrastructure

The promise of ultimate elasticity and operational simplicity of serverless computing has recently led to an explosion of research in this area. In the context of data analytics, the concept sounds appealing, but due to the limitations of…

Databases · Computer Science 2020-05-11 Ingo Müller, Renato Marroquín, Gustavo Alonso

λScale: Enabling Fast Scaling for Serverless Large Language Model Inference

Serverless computing has emerged as a compelling solution for cloud-based model inference. However, as modern large language models (LLMs) continue to grow in size, existing serverless platforms often face substantial model startup…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-09 Minchen Yu, Rui Yang, Chaobo Jia, Zhaoyuan Su, Sheng Yao, Tingfeng Lan, Yuchen Yang, Zirui Wang, Yue Cheng, Wei Wang, Ao Wang, Ruichuan Chen

Serverless Data Analytics with Flint

Serverless architectures organized around loosely-coupled function invocations represent an emerging design for many applications. Recent work mostly focuses on user-facing products and event-driven processing pipelines. In this paper, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-11 Youngbin Kim, Jimmy Lin

Performance Evaluation of Serverless Edge Computing for Machine Learning Applications

Next generation technologies such as smart healthcare, self-driving cars, and smart cities require new approaches to deal with the network traffic generated by the Internet of Things (IoT) devices, as well as efficient programming models to…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-20 Quoc Lap Trieu, Bahman Javadi, Jim Basilakis, Adel N. Toosi

A Deep Reinforcement Learning based Algorithm for Time and Cost Optimized Scaling of Serverless Applications

Serverless computing has gained strong traction in the cloud computing community in recent years. Among the many benefits of this novel computing model, the rapid auto-scaling capability of user applications takes prominence. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-23 Anupama Mampage, Shanika Karunasekera, Rajkumar Buyya

Managing Cold-start in The Serverless Cloud with Temporal Convolutional Networks

Serverless cloud is an innovative cloud service model that frees customers from most cloud management duties. It also offers the same advantages as other cloud models but at much lower costs. As a result, the serverless cloud has been…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-15 Tam N. Nguyen

The High Cost of Keeping Warm: Characterizing Overhead in Serverless Autoscaling Policies

Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-04 Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov

DeepServe: Serverless Large Language Model Serving at Scale

In this paper, we propose DEEPSERVE, a scalable and serverless AI platform designed to efficiently serve large language models (LLMs) at scale in cloud environments. DEEPSERVE addresses key challenges such as resource allocation, serving…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-10 Junhao Hu, Jiang Xu, Zhixia Liu, Yulong He, Yuetao Chen, Hao Xu, Jiang Liu, Jie Meng, Baoquan Zhang, Shining Wan, Gengyuan Dan, Zhiyu Dong, Zhihao Ren, Changhong Liu, Tao Xie, Dayun Lin, Qin Zhang, Yue Yu, Hao Feng, Xusheng Chen, Yizhou Shan

Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control

Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. Its on-demand nature makes it especially appealing for…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-14 Chanh Nguyen, Monowar Bhuyan, Erik Elmroth

Evaluating Serverless Machine Learning Performance on Google Cloud Run

End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions known as microservices. Machine learning…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-25 Prerana Khatiwada, Pranjal Dhakal

Formal Foundations of Serverless Computing

Serverless computing (also known as functions as a service) is a new cloud computing abstraction that makes it easier to write robust, large-scale web services. In serverless computing, programmers write what are called serverless…

Programming Languages · Computer Science 2021-03-12 Abhinav Jangda, Donald Pinckney, Yuriy Brun, Arjun Guha

High-Performance Serverless Computing: A Systematic Literature Review on Serverless for HPC, AI, and Big Data

The widespread deployment of large-scale, compute-intensive applications such as high-performance computing, artificial intelligence, and big data is leading to convergence between cloud and high-performance computing infrastructures. Cloud…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-15 Valerio Besozzi, Matteo Della Bartola, Patrizio Dazzi, Marco Danelutto