Related papers: Zero Queueing for Multi-Server Jobs
Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Much is known in the dropping setting, where jobs are immediately discarded if they require more servers than are currently available.…
We study the multiserver-job setting in the load-focused multilevel scaling limit, where system load approaches capacity much faster than the growth of the number of servers $n$. We consider the ``1 and $n$'' system, where each job requires…
Multiserver jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in today's computing clusters. But little is known about the delay performance of systems with multiserver jobs. We consider queueing…
We present a new framework for designing nonpreemptive and job-size oblivious scheduling policies in the multiserver-job queueing model. The main requirement is to identify a static and balanced sub-partition of the server set and ensure…
Multi-server queueing systems are widely used models for job scheduling in machine learning, wireless networks, crowdsourcing, and healthcare systems. This paper considers a multi-server system with multiple servers and multiple types of…
This paper considers the steady-state performance of load balancing algorithms in a many-server system with distributed queues. The system has $N$ servers, and each server maintains a local queue with buffer size $b-1,$ i.e. a server can…
A broad class of parallel server systems is considered, for which we prove the steady-state asymptotic independence of server workloads, as the number of servers goes to infinity, while the system load remains sub-critical. Arriving jobs…
Multi-server queueing systems describe situations in which users require service from multiple parallel servers. Examples include check-in lines at airports, waiting rooms in hospitals, queues in contact centers, data buffers in wireless…
Most data generated by modern applications is stored in the cloud, and there is an exponential growth in the volume of jobs to access these data and perform computations using them. The volume of data access or computing jobs can be…
For a cloud service provider, delivering optimal system performance while fulfilling Quality of Service (QoS) obligations is critical for maintaining a viably profitable business. This goal is often hard to attain given the irregular nature…
Cloud computing environments often have to deal with random-arrival computational workloads that vary in resource requirements and demand high Quality of Service (QoS) obligations. It is typical that a Service-Level-Agreement (SLA) is…
We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of…
Modern cloud computing workloads are composed of multiresource jobs that require a variety of computational resources in order to run, such as CPU cores, memory, disk space, or hardware accelerators. A single cloud server can typically run…
Modern cloud computing workloads are composed of multiresource jobs that require a variety of computational resources in order to run, such as CPU cores, memory, disk space, or hardware accelerators. A single cloud server can typically run…
A queue is required when a service provider is not able to handle jobs arriving over the time. In a highly flexible and dynamic environment, some jobs might demand for faster execution at run-time especially when the resources are limited…
Recent development of peer-to-peer (P2P) services (e.g. streaming, file sharing, and storage) systems introduces a new type of queue systems that receive little attention before, where both job and server arrive and depart randomly. Current…
A large proportion of jobs submitted to modern computing clusters and data centers are parallelizable and capable of running on a flexible number of computing cores or servers. Although allocating more servers to such a job results in a…
Motivated by the growing interest in today's massive parallel computing capabilities we analyze a queueing network with many servers in parallel to which jobs arrive a according to a Poisson process. Each job, upon arrival, is split into…
In this paper, we study systems where each job or request can be split into a flexible number of sub-jobs up to a maximum limit. The number of sub-jobs a job is split into depends on the number of available servers found upon its arrival.…
Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Essentially all of the theoretical work on multiserver-job systems focuses on maximizing utilization, with almost nothing known about…