Related papers: Parallel Load Balancing on Constrained Client-Serv…
In computer networks, participants may cooperate in processing tasks, so that loads are balanced among them. We present local distributed algorithms that (repeatedly) use local imbalance criteria to transfer loads concurrently across the…
A parallel server system is a stochastic processing network with applications in manufacturing, supply chain, ride-hailing, call centers, etc. Heterogeneous customers arrive in the system, and only a subset of servers can serve any customer…
We study large-scale systems operating under the JSQ$(d)$ policy in the presence of stringent task-server compatibility constraints. Consider a system with $N$ identical single-server queues and $M(N)$ task types, where each server is able…
We consider a large-scale parallel-server system, where each server independently adjusts its processing speed in a decentralized manner. The objective is to minimize the overall cost, which comprises the average cost of maintaining the…
Modern computing workloads are often composed of parallelizable jobs. A parallelizable job can be completed more quickly when run on additional servers. However, each job can only use a limited number of servers, known as its…
We consider a system of $N$ servers inter-connected by some underlying graph topology $G_N$. Tasks arrive at the various servers as independent Poisson processes of rate $\lambda$. Each incoming task is irrevocably assigned to whichever…
In the weighted load balancing problem, the input is an $n$-vertex bipartite graph between a set of clients and a set of servers, and each client comes with some nonnegative real weight. The output is an assignment that maps each client to…
In the load-balancing problem, we have an $n$-vertex bipartite graph $G=(L, R, E)$ between a set of clients and servers. The goal is to find an assignment of all clients to the servers, while minimizing the maximum load on each server,…
Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off…
We develop a novel parallel decomposition strategy for unweighted, undirected graphs, based on growing disjoint connected clusters from batches of centers progressively selected from yet uncovered nodes. With respect to similar previous…
We consider a model inspired by compatibility constraints that arise between tasks and servers in data centers, cloud computing systems and content delivery networks. The constraints are represented by a bipartite graph or network that…
We study probabilistic protocols for concurrent threshold-based load balancing in networks. There are n resources or machines represented by nodes in an undirected graph and m >> n users that try to find an acceptable resource by moving…
We consider a parallel system of $m$ identical machines prone to unpredictable crashes and restarts, trying to cope with the continuous arrival of tasks to be executed. Tasks have different computational requirements (i.e., processing time…
Modern data centers serve workloads which are capable of exploiting parallelism. When a job parallelizes across multiple servers it will complete more quickly, but jobs receive diminishing returns from being allocated additional servers.…
Motivated by modern parallel computing applications, we consider the problem of scheduling parallel-task jobs with heterogeneous resource requirements in a cluster of machines. Each job consists of a set of tasks that can be processed in…
In this paper, we investigate the problem of assignment of $K$ identical servers to a set of $N$ parallel queues in a time slotted queueing system. The connectivity of each queue to each server is randomly changing with time; each server…
In the load balancing problem, the input is an $n$-vertex bipartite graph $G = (C \cup S, E)$ and a positive weight for each client $c \in C$. The algorithm must assign each client $c \in C$ to an adjacent server $s \in S$. The load of a…
In geographically-distributed systems, communication latencies are non-negligible. The perceived processing time of a request is thus composed of the time needed to route the request to the server and the true processing time. Once a…
In distributed ML applications, shared parameters are usually replicated among computing nodes to minimize network overhead. Therefore, proper consistency model must be carefully chosen to ensure algorithm's correctness and provide high…
In this paper we consider neighborhood load balancing in the context of selfish clients. We assume that a network of n processors and m tasks is given. The processors may have different speeds and the tasks may have different weights. Every…