Related papers: LibProf: A Python Profiler for Improving Cold Star…
Serverless computing abstracts away server management, enabling automatic scaling, efficient resource utilization, and cost-effective pricing models. However, despite these advantages, it faces the significant challenge of cold-start…
Python has become a popular programming language because of its excellent programmability. Many modern software packages utilize Python for high-level algorithm design and depend on native libraries written in C/C++/Fortran for efficient…
Serverless computing simplifies deployment and scaling, yet cold-start latency remains a major performance bottleneck. Unlike prior work that treats mitigation as a black-box optimization, we study cold starts as a developer-visible design…
As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start…
Profiling tools (also known as profilers) play an important role in understanding program performance at runtime, such as hotspots, bottlenecks, and inefficiencies. While profilers have been proven to be useful, they give extra burden to…
Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. Its on-demand nature makes it especially appealing for…
Memory profiling captures programs' dynamic memory behavior, assisting programmers in debugging, tuning, and enabling advanced compiler optimizations like speculation-based automatic parallelization. As each use case demands its unique…
Recently, academics and the corporate sector have paid attention to serverless computing, which enables dynamic scalability and an economic model. In serverless computing, users only pay for the time they actually use resources, enabling…
Input-sensitive profiling is a recent performance analysis technique that makes it possible to estimate the empirical cost function of individual routines of a program, helping developers understand how performance scales to larger inputs…
This paper presents the Container Profiler, a software tool that measures and records the resource usage of any containerized task. Our tool profiles the CPU, memory, disk, and network utilization of containerized tasks collecting over…
Serverless computing has gained popularity due to its cost efficiency, ease of deployment, and enhanced scalability. However, in serverless environments, servers are initiated only after receiving a request, leading to increased response…
The key to speeding up applications is often understanding where the elapsed time is spent, and why. This document reviews in depth the full array of performance analysis tools and techniques available on Linux for this task, from the…
This review report discusses the cold start latency in serverless inference and existing solutions. It particularly reviews the ServerlessLLM method, a system designed to address the cold start problem in serverless inference for large…
The rise of LLMs has driven demand for private serverless deployments, characterized by moderate-sized models and infrequent requests. While existing serverless solutions follow exclusive GPU allocation, we take a step back to explore…
While high-level languages come with significant readability and maintainability benefits, their performance remains difficult to predict. For example, programmers may unknowingly use language features inappropriately, which cause their…
Serverless computing has emerged as a pivotal paradigm for deploying Deep Learning (DL) models, offering automatic scaling and cost efficiency. However, the inherent cold start problem in serverless ML inference systems, particularly the…
Diagnosing performance bottlenecks in modern software is essential yet challenging, particularly as applications become more complex and rely on custom resource management policies. While traditional profilers effectively identify execution…
This paper proposes Scalene, a profiler specialized for Python. Scalene combines a suite of innovations to precisely and simultaneously profile CPU, memory, and GPU usage, all with low overhead. Scalene's CPU and memory profilers help…
Automated debugging, long pursued in a variety of fields from software engineering to cybersecurity, requires a framework that offers the building blocks for a programmable debugging workflow. However, existing debuggers are primarily…
Datacenters are witnessing a rapid surge in the adoption of serverless functions for microservices-based applications. A vast majority of these microservices typically span less than a second, have strict SLO requirements, and are chained…