Related papers: HotSwap: Enabling Live Dependency Sharing in Serve…

Caching Aided Multi-Tenant Serverless Computing

One key to enabling high-performance serverless computing is to mitigate cold-starts. Current solutions utilize a warm pool to keep functions alive: a warm-start can be analogous to a CPU cache-hit. However, modern cache has multiple…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-05 Chu Qiao, Cong Wang, Zhenkai Zhang, Yuede Ji, Xing Gao

Performance Evaluation of Snapshot Methods to Warm the Serverless Cold Start

The serverless computing model strengthens the cloud computing tendency to abstract resource management. Serverless platforms are responsible for deploying and scaling the developer's applications. Serverless also incorporated the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-31 Paulo Silva, Thiago Emmanuel Pereira

The High Cost of Keeping Warm: Characterizing Overhead in Serverless Autoscaling Policies

Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-04 Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov

Cross-Edge Orchestration of Serverless Functions with Probabilistic Caching

Serverless edge computing adopts an event-based paradigm that provides back-end services on an as-used basis, resulting in efficient resource utilization. To improve the end-to-end latency and revenue, service providers need to optimize the…

Networking and Internet Architecture · Computer Science 2023-10-09 Chen Chen, Manuel Herrera, Ge Zheng, Liqiao Xia, Zhengyang Ling, Jiangtao Wang

SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions

As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start…

Software Engineering · Computer Science 2024-08-22 Cheryl Lee, Zhouruixing Zhu, Tianyi Yang, Yintong Huo, Yuxin Su, Pinjia He, Michael R. Lyu

Truffle: Efficient Data Passing for Data-Intensive Serverless Workflows in the Edge-Cloud Continuum

Serverless computing promises a scalable, reliable, and cost-effective solution for running data-intensive applications and workflows in the heterogeneous and limited-resource environment of the Edge-Cloud Continuum. However, building and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-02 Cynthia Marcelino, Stefan Nastic

Benchmarking, Analysis, and Optimization of Serverless Function Snapshots

Serverless computing has seen rapid adoption due to its high scalability and flexible, pay-as-you-go billing model. In serverless, developers structure their services as a collection of functions, sporadically invoked by various events like…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-09 Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, Boris Grot

HoLiSwap: Reducing Wire Energy in L1 Caches

This paper describes HoLiSwap, a method to reduce L1 cache wire energy, a significant fraction of total cache energy, by swapping hot lines to the cache way nearest to the processor. We observe that (i) a small fraction (<3%) of cache lines…

Hardware Architecture · Computer Science 2017-01-17 Yatish Turakhia, Subhasis Das, Tor M. Aamodt, William J. Dally

Waltz: Temperature-Aware Cooperative Compression for High-Performance Compression-Based CSDs

Data compression is widely adopted for modern solid-state drives (SSDs) to mitigate both storage capacity and SSD lifetime issues. Researchers have proposed compression schemes at different system layers, including device-side solutions…

Performance · Computer Science 2025-09-09 Dingcui Yu, Yunpeng Song, Yiyang Huang, Yumiao Zhao, Yina Lv, Chundong Wang, Youtao Zhang, Liang Shi

Canvas: Isolated and Adaptive Swapping for Multi-Applications on Remote Memory

Remote memory techniques for datacenter applications have recently gained a great deal of popularity. Existing remote memory techniques focus on the efficiency of a single application setting only. However, when multiple applications co-run…

Operating Systems · Computer Science 2022-10-13 Chenxi Wang, Yifan Qiao, Haoran Ma, Shi Liu, Yiying Zhang, Wenguang Chen, Ravi Netravali, Miryung Kim, Guoqing Harry Xu

Transformer-Based Model for Cold Start Mitigation in FaaS Architecture

Serverless architectures, particularly the Function as a Service (FaaS) model, have become a cornerstone of modern cloud computing due to their ability to simplify resource management and enhance application deployment agility. However, a…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-16 Alexandre Savi Fayam Mbala Mouen, Jerry Lacmou Zeutouo, Vianney Kengne Tchendji

Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs

IoT applications increasingly rely on on-device AI accelerators to ensure high performance, especially in low-connectivity and safety-critical scenarios. However, the limited on-chip memory of these accelerators forces inference runtimes to…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-13 Nathan Ng, Walid A. Hanafy, Prashanthi Kadambi, Balachandra Sunil, Ayush Gupta, David Irwin, Yogesh Simmhan, Prashant Shenoy

Improving Nonpreemptive Multiserver Job Scheduling with Quickswap

Modern data center workloads are composed of multiserver jobs, computational jobs that require multiple servers in order to run. A data center server can run many multiserver jobs in parallel, as long as it has sufficient resources to meet…

Performance · Computer Science 2025-11-14 Zhongrui Chen, Adityo Anggraito, Diletta Olliaro, Andrea Marin, Marco Ajmone Marsan, Benjamin Berg, Isaac Grosof

FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0

Service liquidity across edge-to-cloud or multi-cloud will serve as the cornerstone of the next generation of cloud computing systems (Cloud 2.0). Provided that cloud-based services are predominantly containerized, an efficient and robust…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-02 Sorawit Manatura, Thanawat Chanikaphon, Chantana Chantrapornchai, Mohsen Amini Salehi

Revisiting Swapping in User-space with Lightweight Threading

Memory-intensive applications, such as in-memory databases, caching systems and key-value stores, are increasingly demanding larger main memory to fit their working sets. Conventional swapping can enlarge the memory capacity by paging out…

Operating Systems · Computer Science 2021-07-30 Kan Zhong, Wenlin Cui, Youyou Lu, Quanzhang Liu, Xiaodan Yan, Qizhao Yuan, Siwei Luo, Keji Huang

Serverless Computing: Behind the Scenes of Major Platforms

Serverless computing offers an event-driven, pay-as-you-go framework for application development. A key selling point is the concept of no back-end server management, allowing developers to focus on application functionality. This is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-11 Daniel Kelly, Frank G Glavin, Enda Barrett

Data-driven scheduling in serverless computing to reduce response time

In Function as a Service (FaaS), a serverless computing variant, customers deploy functions instead of complete virtual machines or Linux containers. It is the cloud provider who maintains the runtime environment for these functions. FaaS…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-16 Bartłomiej Przybylski, Paweł Żuk, Krzysztof Rzadca

Hibernate Container: A Deflated Container Mode for Fast Startup and High-density Deployment in Serverless Computing

Serverless computing is a popular cloud computing paradigm, which requires low response latency to handle on-demand user requests. There are two prominent techniques employed for reducing the response latency: keep fully initialized…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-19 Yulin Sun, Deepak Vij, Fenge Li, Wenjian Guo, Ying Xiong

Flexible Swapping for the Cloud

Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. To optimize this resource, "cold" memory can be reclaimed from VMs and stored on slower…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-23 Milan Pandurov, Lukas Humbel, Dmitry Sepp, Adamos Ttofari, Leon Thomm, Do Le Quoc, Siddharth Chandrasekaran, Sharan Santhanam, Chuan Ye, Shai Bergman, Wei Wang, Sven Lundgren, Konstantinos Sagonas, Alberto Ros

FastCap: An Efficient and Fair Algorithm for Power Capping in Many-Core Systems

Future servers will incorporate many active low-power modes for different system components, such as cores and memory. Though these modes provide flexibility for power management via Dynamic Voltage and Frequency Scaling (DVFS), they must be…

Performance · Computer Science 2016-03-07 Yanpei Liu, Guilherme Cox, Qingyuan Deng, Stark C. Draper, Ricardo Bianchini