Related papers: LLM-Based Misconfiguration Detection for AWS Serve…

SlsReuse: LLM-Powered Serverless Function Reuse

Serverless computing has rapidly emerged as a popular cloud computing paradigm. It enables developers to implement function-level tasks, i.e., serverless functions, without managing infrastructure. While reducing operational overhead, it…

Software Engineering · Computer Science 2025-11-24 Jinfeng Wen, Yuehan Sun

SLAM: SLO-Aware Memory Optimization for Serverless Applications

The serverless computing paradigm has become more ingrained in the industry, as it offers a cheap alternative for application development and deployment. This new paradigm has also created new kinds of problems for the developer, who needs to…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-14 Gor Safaryan, Anshul Jindal, Mohak Chadha, Michael Gerndt

Detection of Compromised Functions in a Serverless Cloud Environment

Serverless computing is an emerging cloud paradigm with serverless functions at its core. While serverless environments enable software developers to focus on developing applications without the need to actively manage the underlying…

Cryptography and Security · Computer Science 2024-08-06 Danielle Lavi, Oleg Brodt, Dudu Mimran, Yuval Elovici, Asaf Shabtai

Silent Failures in Stateless Systems: Rethinking Anomaly Detection for Serverless Computing

Serverless computing has redefined cloud application deployment by abstracting infrastructure and enabling on-demand, event-driven execution, thereby enhancing developer agility and scalability. However, maintaining consistent application…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-01 Chanh Nguyen, Erik Elmroth, Monowar Bhuyan

SMLT: A Serverless Framework for Scalable and Adaptive Machine Learning Design and Training

In today's production machine learning (ML) systems, models are continuously trained, improved, and deployed. ML design and training are becoming a continuous workflow of various tasks that have dynamic resource demands. Serverless…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-05 Ahsan Ali, Syed Zawad, Paarijaat Aditya, Istemi Ekin Akkus, Ruichuan Chen, Feng Yan

Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud

This review report discusses the cold start latency in serverless inference and existing solutions. It particularly reviews the ServerlessLLM method, a system designed to address the cold start problem in serverless inference for large…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-26 Himel Ghosh

Formal Foundations of Serverless Computing

Serverless computing (also known as functions as a service) is a new cloud computing abstraction that makes it easier to write robust, large-scale web services. In serverless computing, programmers write what are called serverless…

Programming Languages · Computer Science 2021-03-12 Abhinav Jangda, Donald Pinckney, Yuriy Brun, Arjun Guha

LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations

Security misconfigurations in Container Orchestrators (COs) can pose serious threats to software systems. While Static Analysis Tools (SATs) can effectively detect these security vulnerabilities, the industry currently lacks automated…

Software Engineering · Computer Science 2025-02-05 Ziyang Ye, Triet Huynh Minh Le, M. Ali Babar

LLM Assisted Anomaly Detection Service for Site Reliability Engineers: Enhancing Cloud Infrastructure Resilience

This paper introduces a scalable Anomaly Detection Service with a generalizable API tailored for industrial time-series data, designed to assist Site Reliability Engineers (SREs) in managing cloud infrastructure. The service enables…

Machine Learning · Computer Science 2025-01-29 Nimesh Jha, Shuxin Lin, Srideepika Jayaraman, Kyle Frohling, Christodoulos Constantinides, Dhaval Patel

MLmisFinder: A Specification and Detection Approach of Machine Learning Service Misuses

Machine Learning (ML) cloud services, offered by leading providers such as Amazon, Google, and Microsoft, enable the integration of ML components into software systems without building models from scratch. However, the rapid adoption of ML…

Software Engineering · Computer Science 2026-03-19 Hadil Ben Amor, Niruthiha Selvanayagam, Manel Abdellatif, Taher A. Ghaleb, Naouel Moha

Network Self-Configuration based on Fine-Tuned Small Language Models

As modern networks grow in scale and complexity, manual configuration becomes increasingly inefficient and prone to human error. While intent-driven self-configuration using large language models has shown significant promise, such models…

Networking and Internet Architecture · Computer Science 2025-12-03 Oscar G. Lira, Oscar M. Caicedo, Nelson L. S. Da Fonseca

Serverless AI Security: Attack Surface Analysis and Runtime Protection Mechanisms for FaaS-Based Machine Learning

Serverless computing has achieved widespread adoption, with over 70% of AWS organizations using serverless solutions [1]. Meanwhile, machine learning inference workloads increasingly migrate to Function-as-a-Service (FaaS) platforms for…

Cryptography and Security · Computer Science 2026-01-21 Chetan Pathade, Vinod Dhimam, Sheheryar Ahmad, Ilsa Lareb

ServerlessLLM: Low-Latency Serverless Inference for Large Language Models

This paper presents ServerlessLLM, a distributed system designed to support low-latency serverless inference for Large Language Models (LLMs). By harnessing the substantial near-GPU storage and memory capacities of inference servers,…

Machine Learning · Computer Science 2024-07-26 Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

SmartLLMs Scheduler: A Framework for Cost-Effective LLMs Utilization

Large Language Models (LLMs) such as GPT-4 and Llama have shown remarkable capabilities in a variety of software engineering tasks. Despite the advancements, their practical deployment faces challenges, including high financial costs, long…

Software Engineering · Computer Science 2025-08-06 Yueyue Liu, Hongyu Zhang, Yuantian Miao

LaSS: Running Latency Sensitive Serverless Computations at the Edge

Serverless computing has emerged as a new paradigm for running short-lived computations in the cloud. Due to its ability to handle IoT workloads, there has been considerable interest in running serverless functions at the edge. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-03 Bin Wang, Ahmed Ali-Eldin, Prashant Shenoy

Towards Resource-Efficient Serverless LLM Inference with SLINFER

The rise of LLMs has driven demand for private serverless deployments, characterized by moderate-sized models and infrequent requests. While existing serverless solutions follow exclusive GPU allocation, we take a step back to explore…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-16 Chuhao Xu, Zijun Li, Quan Chen, Han Zhao, Xueyan Tang, Minyi Guo

Serving deep learning models in a serverless platform

Serverless computing has emerged as a compelling paradigm for the development and deployment of a wide range of event based cloud applications. At the same time, cloud providers and enterprise companies are heavily adopting machine learning…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-13 Vatche Ishakian, Vinod Muthusamy, Aleksander Slominski

Dynamic Function Configuration and its Management in Serverless Computing: A Taxonomy and Future Directions

The serverless cloud computing model offers a framework where the service provider abstracts the underlying infrastructure management from developers. In this serverless model, FaaS provides an event-driven, function-oriented computing…

Software Engineering · Computer Science 2025-10-06 Siddharth Agarwal, Maria A. Rodriguez, Rajkumar Buyya

SageServe: Optimizing LLM Serving on Cloud Data Centers with Forecast Aware Auto-Scaling

Global cloud service providers handle inference workloads for Large Language Models (LLMs) that span latency-sensitive (e.g., chatbots) and insensitive (e.g., report writing) tasks, resulting in diverse and often conflicting Service Level…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-14 Shashwat Jaiswal, Kunal Jain, Yogesh Simmhan, Anjaly Parayil, Ankur Mallick, Rujia Wang, Renee St. Amant, Chetan Bansal, Victor Rühle, Anoop Kulkarni, Steve Kofsky, Saravan Rajmohan

ALPS: Automated Least-Privilege Enforcement for Securing Serverless Functions

Serverless computing is increasingly adopted for AI-driven workloads due to its automatic scaling and pay-as-you-go model. However, its function-based architecture creates significant security risks, including excessive privilege allocation…

Cryptography and Security · Computer Science 2026-03-27 Changhee Shin, Bom Kim, Seungsoo Lee