Related papers: Optimal Choice of Threshold in Two Level Processor…
We study the Batch Processor-Sharing (BPS) queueing model with a hyper-exponential service time distribution and a Poisson batch arrival process. One of the main motivations for studying BPS is its possible application in size-based scheduling,…
We consider the job assignment problem in a multi-server system consisting of $N$ parallel processor sharing servers, categorized into $M$ ($\ll N$) different types according to their processing capacity or speed. Jobs of random sizes…
Work sharing and work stealing are two scheduling paradigms to redistribute work when performing distributed computations. In work sharing, processors attempt to migrate pending jobs to other processors in the hope of reducing response…
The paper considers a queueing system with limited processor sharing. No more than $n$ jobs may be served simultaneously. This system may be used for modeling bandwidth sharing in wireless communication systems and processes of service in…
Modern computing workloads are often composed of parallelizable jobs. A parallelizable job can be completed more quickly when run on additional servers. However, each job can only use a limited number of servers, known as its…
The recent development of peer-to-peer (P2P) service systems (e.g. streaming, file sharing, and storage) introduces a new type of queueing system that has received little attention so far, in which both jobs and servers arrive and depart randomly. Current…
We consider a system of parallel queues where tasks are assigned (dispatched) to one of the available servers upon arrival. The dispatching decision is based on the full state information, i.e., on the sizes of the new and existing jobs. We…
We study the conditional sojourn time distributions of processor sharing (PS), foreground background processor sharing (FBPS) and shortest remaining processing time first (SRPT) scheduling disciplines on an event where the job size of a…
We analyze randomized dynamic load balancing schemes for multi-server processor sharing systems when the number of servers in the system is large and the servers have heterogeneous service rates. In particular, we focus on the classical…
The standard setting for studying parallel server systems (PSS) at the diffusion scale is based on the heavy traffic condition (HTC), which assumes that the underlying static allocation linear program (LP) is critical and has a unique…
Deep neural network training jobs and other iterative computations frequently include checkpoints where jobs can be canceled based on the current value of monitored metrics. While most existing results focus on the performance of all…
The paper studies approximations and control of a processor sharing (PS) server where the service rate depends on the number of jobs occupying the server. The control of such a system is implemented by imposing a limit on the number of jobs…
We show that the distribution of supercomputer job submission interarrival times can be understood as a relaxation process. The process of deciding when to submit a job involves a complicated set of interactions between the users…
We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network.…
Distributed opportunistic scheduling (DOS) is studied for wireless ad-hoc networks in which many links contend for the channel using random access before data transmissions. Simply put, DOS involves a process of joint channel probing and…
Staffing rules are an essential management tool in service industries for meeting target service levels. The square-root safety rule, based on the Poisson arrival assumption, has been commonly used. However, empirical findings suggest that…
Scheduling is an important task allowing parallel systems to perform efficiently and reliably. For modern computation systems, divisible load is a special type of data which can be divided into arbitrary sizes and independently processed in…
We consider a computation offloading system where jobs are processed sequentially at a local server followed by a higher-capacity cloud server. The system offers two service modes, differing in how the processing is split between the…
Models of parallel processing systems typically assume that one has $l$ workers and jobs are split into an equal number of $k=l$ tasks. Splitting jobs into $k > l$ smaller tasks, i.e. using ``tiny tasks'', can yield performance and…
The parallel execution of requests in a Cloud Computing platform, as for Virtualized Network Functions, is modeled by an $M^{[X]}/M/1$ Processor-Sharing (PS) system, where each request is seen as a batch of unit jobs. The performance of…