Distributed, Parallel, and Cluster Computing · Computer Science
FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication
Joe Oakley, Hakan Ferhatosmanoglu
2024-03-25
Distributed, Parallel, and Cluster Computing · Computer Science
Towards Resource-Efficient Serverless LLM Inference with SLINFER
Chuhao Xu, Zijun Li, Quan Chen, Han Zhao +2
2025-12-16
Distributed, Parallel, and Cluster Computing · Computer Science
Cost-Performance Analysis: A Comparative Study of CPU-Based Serverless and GPU-Based Training Architectures
Amine Barrak, Fabio Petrillo, Fehmi Jaafar
2025-09-19
Distributed, Parallel, and Cluster Computing · Computer Science
λScale: Enabling Fast Scaling for Serverless Large Language Model Inference
Minchen Yu, Rui Yang, Chaobo Jia, Zhaoyuan Su +8
2026-03-09
Machine Learning · Computer Science
ServerlessLLM: Low-Latency Serverless Inference for Large Language Models
Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete +3
2024-07-26
Distributed, Parallel, and Cluster Computing · Computer Science
A Serverless Architecture for Efficient and Scalable Monte Carlo Markov Chain Computation
Fabio Castagna, Alberto Trombetta, Marco Landoni, Stefano Andreon
2023-10-09
Distributed, Parallel, and Cluster Computing · Computer Science
Serverless Abstractions for Short-Running, Lightweight Streams
Natalie Carl, Niklas Kowallik, Constantin Stahl, Trever Schirmer +2
2026-03-04
Machine Learning · Computer Science
ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs
Yifan Sui, Hao Wang, Hanfei Yu, Yitao Hu +2
2025-05-21
Distributed, Parallel, and Cluster Computing · Computer Science
SMLT: A Serverless Framework for Scalable and Adaptive Machine Learning Design and Training
Ahsan Ali, Syed Zawad, Paarijaat Aditya, Istemi Ekin Akkus +2
2022-05-05
Distributed, Parallel, and Cluster Computing · Computer Science
Efficient Serverless Cold Start: Reducing Library Loading Overhead by Profile-guided Optimization
Syed Salauddin Mohammad Tariq, Ali Al Zein, Soumya Sripad Vaidya, Arati Khanolkar +2
2025-04-29
Distributed, Parallel, and Cluster Computing · Computer Science
SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization
Akrit Mudvari, Yuang Jiang, Leandros Tassiulas
2024-10-17
Distributed, Parallel, and Cluster Computing · Computer Science
Towards Demystifying Intra-Function Parallelism in Serverless Computing
Michael Kiener, Mohak Chadha, Michael Gerndt
2021-11-10
Distributed, Parallel, and Cluster Computing · Computer Science
MLLess: Achieving Cost Efficiency in Serverless Machine Learning Training
Pablo Gimeno Sarroca, Marc Sánchez-Artigas
2022-06-14
Databases · Computer Science
Serverless Query Processing with Flexible Performance SLAs and Prices
Haoqiong Bian, Dongyang Geng, Yunpeng Chai, Anastasia Ailamaki
2024-12-24
Machine Learning · Computer Science
BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching
Yilong Zhao, Shuo Yang, Kan Zhu, Lianmin Zheng +4
2024-11-26
Distributed, Parallel, and Cluster Computing · Computer Science
Supporting Parallelism in Server-based Multiprocessor Systems
Luís Nogueira, Luís Miguel Pinho
2011-06-15
Distributed, Parallel, and Cluster Computing · Computer Science
Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning
Seungbeom Choi, Sunho Lee, Yeonjae Kim, Jongse Park +2
2021-09-06
Distributed, Parallel, and Cluster Computing · Computer Science
Serverless Data Science -- Are We There Yet? A Case Study of Model Serving
Yuncheng Wu, Tien Tuan Anh Dinh, Guoyu Hu, Meihui Zhang +2
2022-03-02
Distributed, Parallel, and Cluster Computing · Computer Science
A Language-based Serverless Function Accelerator
Emily Herbert, Arjun Guha
2020-08-05
Distributed, Parallel, and Cluster Computing · Computer Science
The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution
Frank Sifei Luan, Ron Yifeng Wang, Yile Gu, Ziming Mao +11
2025-10-23