Related papers: HotSwap: Enabling Live Dependency Sharing in Serve…
One key to enabling high-performance serverless computing is to mitigate cold-starts. Current solutions utilize a warm pool to keep functions alive: a warm-start can be analogous to a CPU cache-hit. However, modern cache has multiple…
The serverless computing model strengthens the cloud computing tendency to abstract resource management. Serverless platforms are responsible for deploying and scaling the developer's applications. Serverless also incorporated the…
Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this…
Serverless edge computing adopts an event-based paradigm that provides back-end services on an as-used basis, resulting in efficient resource utilization. To improve the end-to-end latency and revenue, service providers need to optimize the…
As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start…
Serverless computing promises a scalable, reliable, and cost-effective solution for running data-intensive applications and workflows in the heterogeneous and limited-resource environment of the Edge-Cloud Continuum. However, building and…
Serverless computing has seen rapid adoption due to its high scalability and flexible, pay-as-you-go billing model. In serverless, developers structure their services as a collection of functions, sporadically invoked by various events like…
This paper describes HoLiSwap, a method to reduce L1 cache wire energy, a significant fraction of total cache energy, by swapping hot lines to the cache way nearest to the processor. We observe that (i) a small fraction (<3%) of cache lines…
Data compression is widely adopted for modern solid-state drives (SSDs) to mitigate both storage capacity and SSD lifetime issues. Researchers have proposed compression schemes at different system layers, including device-side solutions…
Remote memory techniques for datacenter applications have recently gained a great deal of popularity. Existing remote memory techniques focus on the efficiency of a single application setting only. However, when multiple applications co-run…
Serverless architectures, particularly the Function as a Service (FaaS) model, have become a cornerstone of modern cloud computing due to their ability to simplify resource management and enhance application deployment agility. However, a…
IoT applications increasingly rely on on-device AI accelerators to ensure high performance, especially in low-connectivity and safety-critical scenarios. However, the limited on-chip memory of these accelerators forces inference runtimes to…
Modern data center workloads are composed of multiserver jobs, computational jobs that require multiple servers in order to run. A data center server can run many multiserver jobs in parallel, as long as it has sufficient resources to meet…
Service liquidity across edge-to-cloud or multi-cloud will serve as the cornerstone of the next generation of cloud computing systems (Cloud 2.0). Provided that cloud-based services are predominantly containerized, an efficient and robust…
Memory-intensive applications, such as in-memory databases, caching systems and key-value stores, are increasingly demanding larger main memory to fit their working sets. Conventional swapping can enlarge the memory capacity by paging out…
Serverless computing offers an event-driven, pay-as-you-go framework for application development. A key selling point is the concept of no back-end server management, allowing developers to focus on application functionality. This is…
In Function as a Service (FaaS), a serverless computing variant, customers deploy functions instead of complete virtual machines or Linux containers. It is the cloud provider who maintains the runtime environment for these functions. FaaS…
Serverless computing is a popular cloud computing paradigm, which requires low response latency to handle on-demand user requests. There are two prominent techniques employed for reducing the response latency: keep fully initialized…
Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. To optimize this resource, "cold" memory can be reclaimed from VMs and stored on slower…
Future servers will incorporate many active low-power modes for different system components, such as cores and memory. Though these modes provide flexibility for power management via Dynamic Voltage and Frequency Scaling (DVFS), they must be…