Related papers: Optimal Hyper-Scalable Load Balancing with a Stric…
A fundamental challenge in large-scale networked systems viz., data centers and cloud networks is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this…
A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and…
In geographically-distributed systems, communication latencies are non-negligible. The perceived processing time of a request is thus composed of the time needed to route the request to the server and the true processing time. Once a…
We consider a discrete-time parallel service system consisting of $n$ heterogeneous single server queues with infinite capacity. Jobs arrive to the system as an i.i.d. process with rate proportional to $n$, and must be immediately…
Load balancing across parallel servers is an important class of congestion control problems that arises in service systems. An effective load balancer relies heavily on accurate, real-time congestion information to make routing decisions.…
The basic load balancing scenario involves a single dispatcher where tasks arrive that must immediately be forwarded to one of $N$ single-server queues. We discuss recent advances on scalable load balancing schemes which provide favorable…
We consider a load balancing system consisting of $n$ single-server queues working in parallel, with heterogeneous service rates. Jobs arrive to a central dispatcher, which has to dispatch them to one of the queues immediately upon arrival.…
This paper considers the steady-state performance of load balancing algorithms in a many-server system with distributed queues. The system has $N$ servers, and each server maintains a local queue with buffer size $b-1,$ i.e. a server can…
Traffic load-balancing in datacenters alleviates hot spots and improves network utilization. In this paper, a stable in-network load-balancing algorithm is developed in the setting of software-defined networking. A control plane configures…
We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher…
Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load…
Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best…
Scalable load balancing algorithms are of great interest in cloud networks and data centers, necessitating the use of tractable techniques to compute optimal load balancing policies for good performance. However, most existing scalable…
Recent years have seen a great increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize the said distributed systems, optimal load balancing for parallel queuing architectures must be…
We consider the fundamental problem of managing a bounded size queue buffer where traffic consists of packets of varying size, where each packet requires several rounds of processing before it can be transmitted from the queue buffer. The…
We present here a cost effective framework for a robust scalable and distributed job processing system that adapts to the dynamic computing needs easily with efficient load balancing for heterogeneous systems. The design is such that each…
We present an overview of scalable load balancing algorithms which provide favorable delay performance in large-scale systems, and yet only require minimal implementation overhead. Aimed at a broad audience, the paper starts with an…
In large-scale distributed systems, balancing the load in an efficient way is crucial in order to achieve low latency. Recently, some load balancing policies have been suggested which are able to achieve a bounded maximum queue length in…
Current-day data centers and high-volume cloud services employ a broad set of heterogeneous servers. In such settings, client requests typically arrive at multiple entry points, and dispatching them to servers is an urgent distributed…
We study the scheduling polices for asymptotically optimal delay in queueing systems with switching overhead. Such systems consist of a single server that serves multiple queues, and some capacity is lost whenever the server switches to…