Rack-Aware Regenerating Codes for Data Centers

Information Theory 2019-02-26 v2 math.IT

Abstract

Erasure coding is widely used for massive storage in data centers to achieve high fault tolerance with low storage redundancy. Because cross-rack communication is often costly, it is critical to design erasure codes that minimize the cross-rack repair bandwidth during failure repair. In this paper, we analyze the optimal trade-off between storage redundancy and cross-rack repair bandwidth specifically for data centers, subject to the condition that the original data can be reconstructed from any sufficiently large set of non-failed nodes. We characterize the optimal trade-off curve under functional repair and propose a general family of erasure codes, called rack-aware regenerating codes (RRC), that achieve the optimal trade-off. We further propose exact-repair constructions of RRC that have minimum storage redundancy and minimum cross-rack repair bandwidth, respectively. We show that (i) the minimum storage redundancy constructions support a wide range of parameters and, in most cases, have strictly less cross-rack repair bandwidth than the classical minimum storage regenerating codes, and (ii) the minimum cross-rack repair bandwidth constructions support all parameters and have less cross-rack repair bandwidth than the minimum bandwidth regenerating codes for almost all parameters.

Cite

@article{arxiv.1802.04031,
  title  = {Rack-Aware Regenerating Codes for Data Centers},
  author = {Hanxu Hou and Patrick P. C. Lee and Kenneth W. Shum and Yuchong Hu},
  journal= {arXiv preprint arXiv:1802.04031},
  year   = {2019}
}