Related papers: Zero Queueing for Multi-Server Jobs

Stability for Two-class Multiserver-job Systems

Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Much is known in the dropping setting, where jobs are immediately discarded if they require more servers than are currently available.…

Performance · Computer Science 2020-10-05 Isaac Grosof, Mor Harchol-Balter, Alan Scheller-Wolf

Multiserver-job Response Time under Multilevel Scaling

We study the multiserver-job setting in the load-focused multilevel scaling limit, where system load approaches capacity much faster than the growth of the number of servers $n$. We consider the "1 and $n$" system, where each job requires…

Performance · Computer Science 2026-04-01 Isaac Grosof, Hayriye Ayhan

Sharp Waiting-Time Bounds for Multiserver Jobs

Multiserver jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in today's computing clusters. But little is known about the delay performance of systems with multiserver jobs. We consider queueing…

Performance · Computer Science 2023-04-17 Yige Hong, Weina Wang

Balanced Splitting: A Framework for Achieving Zero-wait in the Multiserver-job Model

We present a new framework for designing nonpreemptive and job-size oblivious scheduling policies in the multiserver-job queueing model. The main requirement is to identify a static and balanced sub-partition of the server set and ensure…

Performance · Computer Science 2025-02-04 Jonatha Anselmi, Josu Doncel

Learning While Scheduling in Multi-Server Systems with Unknown Statistics: MaxWeight with Discounted UCB

Multi-server queueing systems are widely used models for job scheduling in machine learning, wireless networks, crowdsourcing, and healthcare systems. This paper considers a multi-server system with multiple servers and multiple types of…

Machine Learning · Computer Science 2023-06-05 Zixian Yang, R. Srikant, Lei Ying

On Universal Scaling of Distributed Queues under Load Balancing

This paper considers the steady-state performance of load balancing algorithms in a many-server system with distributed queues. The system has $N$ servers, and each server maintains a local queue with buffer size $b-1,$ i.e. a server can…

Probability · Mathematics 2019-12-30 Xin Liu, Lei Ying

Large-scale parallel server system with multi-component jobs

A broad class of parallel server systems is considered, for which we prove the steady-state asymptotic independence of server workloads, as the number of servers goes to infinity, while the system load remains sub-critical. Arriving jobs…

Probability · Mathematics 2020-12-21 Seva Shneer, Alexander Stolyar

Economies-of-scale in resource sharing systems: tutorial and partial review of the QED heavy-traffic regime

Multi-server queueing systems describe situations in which users require service from multiple parallel servers. Examples include check-in lines at airports, waiting rooms in hospitals, queues in contact centers, data buffers in wireless…

Probability · Mathematics 2019-07-30 Johan S. H. van Leeuwaarden, Britt W. J. Mathijsen, Bert Zwart

Tackling Heterogeneous Traffic in Multi-access Systems via Erasure Coded Servers

Most data generated by modern applications is stored in the cloud, and there is an exponential growth in the volume of jobs to access these data and perform computations using them. The volume of data access or computing jobs can be…

Performance · Computer Science 2022-08-16 Tuhinangshu Choudhury, Weina Wang, Gauri Joshi

QoS-Driven Job Scheduling: Multi-Tier Dependency Considerations

For a cloud service provider, delivering optimal system performance while fulfilling Quality of Service (QoS) obligations is critical for maintaining a viably profitable business. This goal is often hard to attain given the irregular nature…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-14 Husam Suleiman, Otman Basir

Service Level Driven Job Scheduling in Multi-Tier Cloud Computing: A Biologically Inspired Approach

Cloud computing environments often have to deal with random-arrival computational workloads that vary in resource requirements and demand high Quality of Service (QoS) obligations. It is typical that a Service-Level-Agreement (SLA) is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-21 Husam Suleiman, Otman Basir

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-29 Konstantinos Psychas, Javad Ghaderi

Improving Multiresource Job Scheduling with Markovian Service Rate Policies

Modern cloud computing workloads are composed of multiresource jobs that require a variety of computational resources in order to run, such as CPU cores, memory, disk space, or hardware accelerators. A single cloud server can typically run…

Performance · Computer Science 2025-05-05 Zhongrui Chen, Isaac Grosof, Benjamin Berg

Effective Handling of Urgent Jobs - Speed Up Scheduling for Computing Applications

A queue is required when a service provider is not able to handle jobs arriving over time. In a highly flexible and dynamic environment, some jobs might demand faster execution at run-time, especially when the resources are limited…

Performance · Computer Science 2015-03-24 Yash Gupta, Kamalakar Karlapalem

On Stability and Sojourn Time of Peer-to-Peer Queuing Systems

The recent development of peer-to-peer (P2P) service systems (e.g. streaming, file sharing, and storage) has introduced a new type of queueing system that has received little attention, in which both jobs and servers arrive and depart randomly. Current…

Performance · Computer Science 2016-05-11 Taoyu Li, Minghua Chen, Tony Lee, Xing Li

On Optimal Server Allocation for Moldable Jobs with Concave Speed-Up

A large proportion of jobs submitted to modern computing clusters and data centers are parallelizable and capable of running on a flexible number of computing cores or servers. Although allocating more servers to such a job results in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-17 Samira Ghanbarian, Arpan Mukhopadhyay, Ravi R. Mazumdar, Fabrice M. Guillemin

Parallel queues with synchronization

Motivated by the growing interest in today's massive parallel computing capabilities, we analyze a queueing network with many servers in parallel to which jobs arrive according to a Poisson process. Each job, upon arrival, is split into…

Probability · Mathematics 2015-07-20 Mariana Olvera-Cravioto, Octavio Ruiz-Lacedelli

On the Performance of Large Loss Systems with Adaptive Multiserver Jobs

In this paper, we study systems where each job or request can be split into a flexible number of sub-jobs up to a maximum limit. The number of sub-jobs a job is split into depends on the number of available servers found upon its arrival.…

Probability · Mathematics 2023-09-04 Samira Ghanbarian, Arpan Mukhopadhyay, Fabrice M. Guillemin, Ravi R. Mazumdar

Optimal Scheduling in the Multiserver-job Model under Heavy Traffic

Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Essentially all of the theoretical work on multiserver-job systems focuses on maximizing utilization, with almost nothing known about…

Performance · Computer Science 2022-11-08 Isaac Grosof, Ziv Scully, Mor Harchol-Balter, Alan Scheller-Wolf