Related papers: Controlling Data Access Load in Distributed System…
To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at $d$ different nodes, and each node stores the same number of objects.…
Load balancing is critical for distributed storage to meet strict service-level objectives (SLOs). It has been shown that a fast cache can guarantee load balancing for a clustered storage system. However, when the system scales out to…
We consider replication-based distributed storage systems in which each node stores the same quantum of data and each data bit stored has the same replication factor across the nodes. Such systems are referred to as balanced distributed…
We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store…
A new system model reflecting the clustered structure of distributed storage is suggested to investigate interplay between storage overhead and repair bandwidth as storage node failures occur. Large data centers with multiple racks/disks or…
A new system model reflecting the clustered structure of distributed storage is suggested to investigate bandwidth requirements for repairing failed storage nodes. Large data centers with multiple racks/disks or local networks of storage…
Distributed databases often suffer unequal distribution of data among storage nodes, which is known as `data skew'. Data skew arises from a number of causes such as removal of existing storage nodes and addition of new empty nodes to the…
Load Balancing is a fundamental technology for scaling cloud infrastructure. It enables systems to distribute incoming traffic across backend servers using predefined algorithms such as round robin, weighted round robin, least connections,…
Distributed computing systems implement redundancy to reduce the job completion time and variability. Despite a large body of work about computing redundancy, the analytical performance evaluation of redundancy techniques in queuing systems…
With the rapid development in wide area networks and low cost, powerful computational resources, grid computing has gained its popularity. With the advent of grid computing, space limitations of conventional distributed systems can be…
Distributed storage systems (DSSs) provide a scalable solution for reliably storing massive amounts of data coming from various sources. Heterogeneity of these data sources often means different data classes (types) exist in a DSS, each…
Redundant storage maintains the performance of distributed systems under various forms of uncertainty. This paper considers the uncertainty in node access and download service. We consider two access models under two download service…
Current-day data centers and high-volume cloud services employ a broad set of heterogeneous servers. In such settings, client requests typically arrive at multiple entry points, and dispatching them to servers is an urgent distributed…
A parallel computer system is a collection of processing elements that communicate and cooperate to solve large computational problems efficiently. To achieve this, at first the large computational problem is partitioned into several tasks…
Cloud infrastructure users often allocate a fixed number of nodes to individual container clusters (e.g., Kubernetes, OpenShift), resulting in underutilization of computing resources due to asynchronous and variable workload peaks across…
Recently, coding has been a useful technique to mitigate the effect of stragglers in distributed computing. However, coding in this context has been mainly explored under the assumption of homogeneous workers, although the real-world…
We consider load balancing in a network of caching servers delivering contents to end users. Randomized load balancing via the so-called power of two choices is a well-known approach in parallel and distributed systems. In this framework,…
The classification of the most used load balancing algorithms in distributed systems (including cloud technology, cluster systems, grid systems) is described. Comparative analysis of types of the load balancing algorithms is conducted in…
Cloud resource management is often modeled by two-dimensional bin packing with a set of items that correspond to tasks having fixed CPU and memory requirements. However, applications running in clouds are much more flexible: modern…
A fundamental challenge in large-scale networked systems viz., data centers and cloud networks is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this…