Related papers: Serving deep learning models in a serverless platf…

200 papers

Recent developments in Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products. This widespread adoption of AI requires significant efforts in deploying these…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-23 Kamil Kojs

This review report discusses the cold start latency in serverless inference and existing solutions. It particularly reviews the ServerlessLLM method, a system designed to address the cold start problem in serverless inference for large…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-26 Himel Ghosh

This paper explores serverless cloud computing for double machine learning. Being based on repeated cross-fitting, double machine learning is particularly well suited to exploit the high level of parallelism achievable with serverless…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-21 Malte S. Kurz

Serverless computing offers an event driven pay-as-you-go framework for application development. A key selling point is the concept of no back-end server management, allowing developers to focus on application functionality. This is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-11 Daniel Kelly, Frank G Glavin, Enda Barrett

Data processing systems are increasingly deployed in the cloud. While monolithic systems run fully on virtual servers, recent systems embrace cloud infrastructure and utilize the disaggregation of compute and storage to scale them…

Databases · Computer Science 2025-01-15 Thomas Bodner, Theo Radig, David Justen, Daniel Ritter, Tilmann Rabl

The appeal of serverless (FaaS) has triggered a growing interest in how to use it in data-intensive applications such as ETL, query processing, or machine learning (ML). Several systems exist for training large-scale ML models on top of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-18 Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang

Machine learning (ML) is an important part of modern data science applications. Data scientists today have to manage the end-to-end ML life cycle that includes both model training and model serving, the latter of which is essential, as it…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-02 Yuncheng Wu, Tien Tuan Anh Dinh, Guoyu Hu, Meihui Zhang, Yeow Meng Chee, Beng Chin Ooi

Organisations are increasingly putting machine learning models into production at scale. The increasing popularity of serverless scale-to-zero paradigms presents an opportunity for deploying machine learning models to help mitigate…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-27 Clive Cox, Dan Sun, Ellis Tarn, Animesh Singh, Rakesh Kelkar, David Goodwin

The promise of ultimate elasticity and operational simplicity of serverless computing has recently led to an explosion of research in this area. In the context of data analytics, the concept sounds appealing, but due to the limitations of…

Databases · Computer Science 2020-05-11 Ingo Müller, Renato Marroquín, Gustavo Alonso

Serverless computing has emerged as a compelling solution for cloud-based model inference. However, as modern large language models (LLMs) continue to grow in size, existing serverless platforms often face substantial model startup…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-09 Minchen Yu, Rui Yang, Chaobo Jia, Zhaoyuan Su, Sheng Yao, Tingfeng Lan, Yuchen Yang, Zirui Wang, Yue Cheng, Wei Wang, Ao Wang, Ruichuan Chen

Serverless architectures organized around loosely-coupled function invocations represent an emerging design for many applications. Recent work mostly focuses on user-facing products and event-driven processing pipelines. In this paper, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-11 Youngbin Kim, Jimmy Lin

Next generation technologies such as smart healthcare, self-driving cars, and smart cities require new approaches to deal with the network traffic generated by the Internet of Things (IoT) devices, as well as efficient programming models to…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-20 Quoc Lap Trieu, Bahman Javadi, Jim Basilakis, Adel N. Toosi

Serverless computing has gained strong traction in the cloud computing community in recent years. Among the many benefits of this novel computing model, the rapid auto-scaling capability of user applications takes prominence. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-23 Anupama Mampage, Shanika Karunasekera, Rajkumar Buyya

Serverless cloud is an innovative cloud service model that frees customers from most cloud management duties. It also offers the same advantages as other cloud models but at much lower costs. As a result, the serverless cloud has been…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-15 Tam N. Nguyen

Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-04 Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov

In this paper, we propose DEEPSERVE, a scalable and serverless AI platform designed to efficiently serve large language models (LLMs) at scale in cloud environments. DEEPSERVE addresses key challenges such as resource allocation, serving…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-10 Junhao Hu, Jiang Xu, Zhixia Liu, Yulong He, Yuetao Chen, Hao Xu, Jiang Liu, Jie Meng, Baoquan Zhang, Shining Wan, Gengyuan Dan, Zhiyu Dong, Zhihao Ren, Changhong Liu, Tao Xie, Dayun Lin, Qin Zhang, Yue Yu, Hao Feng, Xusheng Chen, Yizhou Shan

Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. Its on-demand nature makes it especially appealing for…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-14 Chanh Nguyen, Monowar Bhuyan, Erik Elmroth

End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions known as microservices. Machine learning…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-25 Prerana Khatiwada, Pranjal Dhakal

Serverless computing (also known as functions as a service) is a new cloud computing abstraction that makes it easier to write robust, large-scale web services. In serverless computing, programmers write what are called serverless…

Programming Languages · Computer Science 2021-03-12 Abhinav Jangda, Donald Pinckney, Yuriy Brun, Arjun Guha

The widespread deployment of large-scale, compute-intensive applications such as high-performance computing, artificial intelligence, and big data is leading to convergence between cloud and high-performance computing infrastructures. Cloud…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-15 Valerio Besozzi, Matteo Della Bartola, Patrizio Dazzi, Marco Danelutto