Related papers: Coded Data Rebalancing for Decentralized Distribut…
Distributed databases often suffer unequal distribution of data among storage nodes, which is known as `data skew'. Data skew arises from a number of causes such as removal of existing storage nodes and addition of new empty nodes to the…
We consider replication-based distributed storage systems in which each node stores the same quantum of data and each data bit stored has the same replication factor across the nodes. Such systems are referred to as balanced distributed…
Data shuffling between distributed cluster of nodes is one of the critical steps in implementing large-scale learning algorithms. Randomly shuffling the data-set among a cluster of workers allows different nodes to obtain fresh data…
Distributed systems store data objects redundantly to balance the data access load over multiple nodes. Load balancing performance depends mainly on 1) the level of storage redundancy and 2) the assignment of data objects to storage nodes.…
Distributed storage systems with replication are well known for storing large amount of data. A large number of replication is done in order to provide reliability. This makes the system expensive. Various methods have been proposed over…
Content delivery networks store information distributed across multiple servers, so as to balance the load and avoid unrecoverable losses in case of node or disk failures. Coded caching has been shown to be a useful technique which can…
We consider load balancing problem in a cache network consisting of storage-enabled servers forming a distributed content delivery scenario. Previously proposed load balancing solutions cannot perfectly balance out requests among servers,…
In a distributed storage system, code symbols are dispersed across space in nodes or storage units as opposed to time. In settings such as that of a large data center, an important consideration is the efficient repair of a failed node.…
We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store…
Coded distributed computing can alleviate the communication load by leveraging the redundant storage and computation resources with coding techniques in distributed computing. In this paper, we study a MapReduce-type distributed computing…
Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing…
Data shuffling of training data among different computing nodes (workers) has been identified as a core element to improve the statistical performance of modern large-scale machine learning algorithms. Data shuffling is often considered as…
In distributed computing systems with stragglers, various forms of redundancy can improve the average delay performance. We study the optimal replication of data in systems where the job execution time is a stochastically decreasing and…
Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. In order to achieve highly-available yet high-performing services, cloud data stores…
Many large-scale machine learning (ML) applications need to perform decentralized learning over datasets generated at different devices and locations. Such datasets pose a significant challenge to decentralized learning because their…
Replicating or caching popular content in memories distributed across the network is a technique to reduce peak network loads. Conventionally, the main performance gain of this caching was thought to result from making part of the requested…
We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access…
Coded caching is a technique that leverages locally cached contents at the end users to reduce the network's peak-time communication load. Coded caching has been shown to achieve significant performance gains with a centralized placement…
We study the problem of the reconstruction of a Gaussian field defined in [0,1] using N sensors deployed at regular intervals. The goal is to quantify the total data rate required for the reconstruction of the field with a given mean square…
We consider large-scale wireless sensor networks with $n$ nodes, out of which k are in possession, (e.g., have sensed or collected in some other way) k information packets. In the scenarios in which network nodes are vulnerable because of,…