Related papers: Coded Data Rebalancing: Fundamental Limits and Con…

Coded Data Rebalancing for Decentralized Distributed Databases

The performance of replication-based distributed databases is affected due to non-uniform storage across storage nodes (also called data skew) and reduction in the replication factor during operation, particularly due to node…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-13 K V Sushena Sree, Prasad Krishnan

Coded Data Rebalancing for Distributed Data Storage Systems with Cyclic Storage

We consider replication-based distributed storage systems in which each node stores the same quantum of data and each data bit stored has the same replication factor across the nodes. Such systems are referred to as balanced distributed…

Information Theory · Computer Science 2024-12-13 Abhinav Vaishya, Athreya Chandramouli, Srikar Kale, Prasad Krishnan

Controlling Data Access Load in Distributed Systems

Distributed systems store data objects redundantly to balance the data access load over multiple nodes. Load balancing performance depends mainly on 1) the level of storage redundancy and 2) the assignment of data objects to storage nodes.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-19 Mehmet Aktas, Emina Soljanin

Distributed Storage Allocations

We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store…

Information Theory · Computer Science 2016-11-15 Derek Leong, Alexandros G. Dimakis, Tracey Ho

Near Optimal Coded Data Shuffling for Distributed Learning

Data shuffling between distributed cluster of nodes is one of the critical steps in implementing large-scale learning algorithms. Randomly shuffling the data-set among a cluster of workers allows different nodes to obtain fresh data…

Information Theory · Computer Science 2018-01-08 Mohamed A. Attia, Ravi Tandon

Evaluating Load Balancing Performance in Distributed Storage with Redundancy

To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at $d$ different nodes, and each node stores the same number of objects.…

Performance · Computer Science 2021-01-26 Mehmet Fatih Aktas, Amir Behrouzi-Far, Emina Soljanin, Philip Whiting

Cost-Bandwidth Tradeoff In Distributed Storage Systems

Distributed storage systems are mainly justified due to the limited amount of storage capacity and improving the reliability through distributing data over multiple storage nodes. On the other hand, it may happen the data is stored in…

Information Theory · Computer Science 2010-04-15 Soroush Akhlaghi, Abbas Kiani, Mohammad Reza Ghanavati

Capacity bounds for distributed storage

One of the primary objectives of a distributed storage system is to reliably store large amounts of source data for long durations using a large number $N$ of unreliable storage nodes, each with $c$ bits of storage capacity. Storage nodes…

Information Theory · Computer Science 2018-04-13 Michael Luby

Cost Analysis of Redundancy Schemes for Distributed Storage Systems

Distributed storage infrastructures require the use of data redundancy to achieve high data reliability. Unfortunately, the use of redundancy introduces storage and communication overheads, which can either reduce the overall storage…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-19 Lluis Pamies-Juarez, Ernst Biersack

Distributed storage algorithms with optimal tradeoffs

One of the primary objectives of a distributed storage system is to reliably store large amounts of source data for long durations using a large number $N$ of unreliable storage nodes, each with $c$ bits of storage capacity. Storage nodes…

Information Theory · Computer Science 2021-01-14 Michael Luby, Thomas Richardson

Network Coding for Distributed Storage Systems

Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing…

Networking and Internet Architecture · Computer Science 2008-03-06 Alexandros G. Dimakis, P. Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

On the Worst-case Communication Overhead for Distributed Data Shuffling

Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing…

Information Theory · Computer Science 2016-10-03 Mohamed Attia, Ravi Tandon

Storage Allocation for Multi-Class Distributed Data Storage Systems

Distributed storage systems (DSSs) provide a scalable solution for reliably storing massive amounts of data coming from various sources. Heterogeneity of these data sources often means different data classes (types) exist in a DSS, each…

Information Theory · Computer Science 2017-01-24 Koosha Pourtahmasi Roshandeh, Moslem Noori, Masoud Ardakani, Chintha Tellambura

Multi-Rack Distributed Data Storage Networks

The majority of works in distributed storage networks assume a simple network model with a collection of identical storage nodes with the same communication cost between the nodes. In this paper, we consider a realistic multi-rack…

Information Theory · Computer Science 2019-03-11 Ali Tebbi, Terence H. Chan, Chi Wan Sung

Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores

Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. In order to achieve highly-available yet high-performing services, cloud data stores…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-08-10 Álvaro García-Recuero, Sérgio Esteves, Luís Veiga

Coded Caching with Distributed Storage

Content delivery networks store information distributed across multiple servers, so as to balance the load and avoid unrecoverable losses in case of node or disk failures. Coded caching has been shown to be a useful technique which can…

Information Theory · Computer Science 2016-11-22 Tianqiong Luo, Vaneet Aggarwal, Borja Peleato

Balancing Fixed Number of Nodes Among Multiple Fixed Clusters

Cloud infrastructure users often allocate a fixed number of nodes to individual container clusters (e.g., Kubernetes, OpenShift), resulting in underutilization of computing resources due to asynchronous and variable workload peaks across…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-11 Paritosh Ranjan, Surajit Majumder, Prodip Roy, Bhuban Padhan

Capacity of Distributed Storage Systems with Clusters and Separate Nodes

In distributed storage systems (DSSs), the optimal tradeoff between node storage and repair bandwidth is an important issue for designing distributed coding strategies to ensure large scale data reliability. The capacity of DSSs is obtained…

Information Theory · Computer Science 2019-01-11 Jingzhao Wang, Tinghan Wang, Yuan Luo, Kenneth W. Shum

An Empirical Study of the Repair Performance of Novel Coding Schemes for Networked Distributed Storage Systems

Erasure coding techniques are getting integrated in networked distributed storage systems as a way to provide fault-tolerance at the cost of less storage overhead than traditional replication. Redundancy is maintained over time through…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-06-12 Lluis Pamies-Juarez, Frédérique Oggier, Anwitaman Datta

Erasure Coding for Distributed Storage: An Overview

In a distributed storage system, code symbols are dispersed across space in nodes or storage units as opposed to time. In settings such as that of a large data center, an important consideration is the efficient repair of a failed node.…

Information Theory · Computer Science 2018-06-13 S. B. Balaji, M. Nikhil Krishnan, Myna Vajha, Vinayak Ramkumar, Birenjith Sasidharan, P. Vijay Kumar