Clustered Network Coding for Maintenance in Practical Storage Systems

Distributed, Parallel, and Cluster Computing 2012-06-20 v1

Abstract

Classical erasure codes, such as Reed-Solomon codes, are an established and efficient alternative to plain replication for reducing storage overhead in reliable distributed storage systems. Yet such codes incur high overhead during maintenance. In this paper, we propose a novel erasure-coded framework tailored to networked storage systems. Our approach couples random codes with a clustered placement strategy, so that a failed machine can be maintained at the granularity of multiple files. Because several files can be repaired simultaneously, our repair protocol leverages network coding techniques to halve the amount of data transferred during maintenance. As formally proven and demonstrated by our evaluation on a public experimental testbed, this approach dramatically decreases both the bandwidth overhead of the maintenance process and the time to repair a failure. In addition, the implementation is kept as simple as possible, with deployment in practical systems in mind.
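
As a rough illustration of the two building blocks the abstract names, random codes and network coding, the sketch below encodes k data blocks into n randomly coded blocks and shows that a random recombination of coded blocks is itself a valid coded block. It is a generic random linear coding example, not the paper's repair protocol: the prime field, the modulus `P`, and the function names `encode`, `recombine`, and `decode` are all assumptions made for this example.

```python
import random

# Assumption: arithmetic over the prime field GF(P), P = 2^31 - 1.
# Block symbols must be integers in [0, P).
P = 2_147_483_647


def encode(blocks, n, rng):
    """Encode k equal-length data blocks into n random linear combinations.

    Each coded block is a (coefficients, payload) pair.
    """
    k = len(blocks)
    size = len(blocks[0])
    coded = []
    for _ in range(n):
        coeffs = [rng.randrange(1, P) for _ in range(k)]
        payload = [sum(c * b[i] for c, b in zip(coeffs, blocks)) % P
                   for i in range(size)]
        coded.append((coeffs, payload))
    return coded


def recombine(coded, rng):
    """Network-coding step: randomly combine coded blocks without decoding.

    The result is again a (coefficients, payload) pair over the same data.
    """
    weights = [rng.randrange(1, P) for _ in coded]
    k = len(coded[0][0])
    size = len(coded[0][1])
    coeffs = [sum(w * c[0][j] for w, c in zip(weights, coded)) % P
              for j in range(k)]
    payload = [sum(w * c[1][i] for w, c in zip(weights, coded)) % P
               for i in range(size)]
    return coeffs, payload


def decode(coded):
    """Recover the k data blocks from exactly k coded blocks.

    Uses Gauss-Jordan elimination modulo P; raises ValueError if the
    chosen blocks are linearly dependent.
    """
    k = len(coded)
    rows = [list(c) + list(p) for c, p in coded]
    for col in range(k):
        pivot = next((r for r in range(col, k) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coded blocks are linearly dependent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse (Python 3.8+)
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[col])]
    return [row[k:] for row in rows]
```

With a large field, any k of the n coded blocks are linearly independent with high probability, so `decode` can work from an arbitrary subset of size k. A block produced by `recombine` can stand in for a lost one, which is the property that repair schemes based on network coding rely on.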

Cite

@article{arxiv.1206.4175,
  title  = {Clustered Network Coding for Maintenance in Practical Storage Systems},
  author = {Anne-Marie Kermarrec and Erwan Le Merrer and Gilles Straub and Alexandre van Kempen},
  journal= {arXiv preprint arXiv:1206.4175},
  year   = {2012}
}

Comments

14 pages, 13 figures
