Related papers: Load Balancing Using Sparse Communication
Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off…
Scalable load balancing algorithms are of great interest in cloud networks and data centers, necessitating the use of tractable techniques to compute optimal load balancing policies for good performance. However, most existing scalable…
In geographically-distributed systems, communication latencies are non-negligible. The perceived processing time of a request is thus composed of the time needed to route the request to the server and the true processing time. Once a…
Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load…
In the load-balancing problem, we have an $n$-vertex bipartite graph $G=(L, R, E)$ between a set of clients and servers. The goal is to find an assignment of all clients to the servers, while minimizing the maximum load on each server,…
While sparse attention mitigates the computational bottleneck of long-context LLM training, its distributed training process exhibits extreme heterogeneity in both \textit{1)} sequence length and \textit{2)} sparsity sensitivity, leading to…
The paper is devoted to the approximate consensus problem for networks of nonlinear agents with switching topology, noisy and delayed measurements. In contrast to the existing stochastic approximation-based control algorithms (protocols), a…
Current-day data centers and high-volume cloud services employ a broad set of heterogeneous servers. In such settings, client requests typically arrive at multiple entry points, and dispatching them to servers is an urgent distributed…
Parallel applications with irregular and time-varying workloads often suffer from load imbalance. Dynamic load balancing techniques address this challenge by redistributing work during execution. We present a new type of distributed…
Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best…
Load Balancing is a fundamental technology for scaling cloud infrastructure. It enables systems to distribute incoming traffic across backend servers using predefined algorithms such as round robin, weighted round robin, least connections,…
This paper considers the massive connectivity problem in an asynchronous grant-free random access system, where a huge number of devices sporadically transmit data to a base station (BS) with imperfect synchronization. The goal is to design…
In this paper, we analyze the performance of random load resampling and migration strategies in parallel server systems. Clients initially attach to an arbitrary server, but may switch server independently at random instants of time in an…
We consider load balancing in a network of caching servers delivering contents to end users. Randomized load balancing via the so-called power of two choices is a well-known approach in parallel and distributed systems that reduces network…
Network alignment generalizes and unifies several approaches for forming a matching or alignment between the vertices of two graphs. We study a mathematical programming framework for network alignment problem and a sparse variation of it…
Load balancing algorithms play critical roles in systems where the workload has to be distributed across multiple resources, such as cores in multiprocessor system, computers in distributed computing, and network links. In this paper, we…
We consider load balancing in a network of caching servers delivering contents to end users. Randomized load balancing via the so-called power of two choices is a well-known approach in parallel and distributed systems. In this framework,…
In this paper, a communication-efficient multi-processor compressed sensing framework based on the approximate message passing algorithm is proposed. We perform lossy compression on the data being communicated between processors, resulting…
We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher…
Oblivious load-balancing in networks involves routing traffic from sources to destinations using predetermined routes independent of the traffic, so that the maximum load on any link in the network is minimized. We investigate oblivious…