Related papers: Self-Learning Threshold-Based Load Balancing
A fundamental challenge in large-scale networked systems viz., data centers and cloud networks is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this…
Consider a system of identical server pools where tasks with exponentially distributed service times arrive as a time-inhomogenenous Poisson process. An admission threshold is used in an inner control loop to assign incoming tasks to server…
We consider a system with several job types and two parallel server pools. Within the pools the servers are homogeneous, but across pools possibly not in the sense that the service speed of a job may depend on its type as well as the server…
Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off…
Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best…
We consider a load balancing system consisting of $n$ single-server queues working in parallel, with heterogeneous service rates. Jobs arrive to a central dispatcher, which has to dispatch them to one of the queues immediately upon arrival.…
Load Balancing is a fundamental technology for scaling cloud infrastructure. It enables systems to distribute incoming traffic across backend servers using predefined algorithms such as round robin, weighted round robin, least connections,…
We consider the load balancing problem in large-scale heterogeneous systems with multiple dispatchers. We introduce a general framework called Local-Estimation-Driven (LED). Under this framework, each dispatcher keeps local (possibly…
In geographically-distributed systems, communication latencies are non-negligible. The perceived processing time of a request is thus composed of the time needed to route the request to the server and the true processing time. Once a…
A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and…
In this paper, we consider a load balancing system under a general pull-based policy. In particular, each arrival is randomly dispatched to one of the servers whose queue lengths are below a threshold, if there are any; otherwise, this…
Consider a service system where incoming tasks are instantaneously dispatched to one out of many heterogeneous server pools. Associated with each server pool is a concave utility function which depends on the class of the server pool and…
Parallel iterative applications often suffer from load imbalance, one of the most critical performance degradation factors. Hence, load balancing techniques are used to distribute the workload evenly to maximize performance. A key challenge…
Recent years have seen a great increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize the said distributed systems, optimal load balancing for parallel queuing architectures must be…
We consider a system of $N$ identical server pools and a single dispatcher where tasks arrive as a Poisson process of rate $\lambda(N)$. Arriving tasks cannot be queued, and must immediately be assigned to one of the server pools to start…
We study probabilistic protocols for concurrent threshold-based load balancing in networks. There are n resources or machines represented by nodes in an undirected graph and m >> n users that try to find an acceptable resource by moving…
Nowadays, more and more increasingly hard computations are performed in challenging fields like weather forecasting, oil and gas exploration, and cryptanalysis. Many of such computations can be implemented using a computer cluster with a…
In this paper, we analyze the performance of random load resampling and migration strategies in parallel server systems. Clients initially attach to an arbitrary server, but may switch server independently at random instants of time in an…
We consider a simple computation offloading model where jobs can either be fully processed in the cloud or be partially processed at a local server before being sent to the cloud to complete processing. Our goal is to design a policy for…
Operating a distributed data stream processing workload efficiently at scale is hard. The operator of the workload must parallelize and lay out tasks of the workload with resources that match the requirement of target data rate. The…