Cost-Driven Data Replication with Predictions

Data Structures and Algorithms 2024-04-26 v1

Abstract

This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. We develop an online algorithm and prove that it is $\frac{5+\alpha}{3}$-consistent (competitiveness under perfect predictions) and $\left(1 + \frac{1}{\alpha}\right)$-robust (competitiveness under terrible predictions), where $\alpha \in (0, 1]$ is a hyper-parameter representing the level of distrust in the predictions. We also study the impact of mispredictions on the competitive ratio of the proposed algorithm and adapt it to achieve a bounded robustness while retaining its consistency. We further establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm. We evaluate our algorithms experimentally using real data access traces.

Cite

@article{arxiv.2404.16489,
  title  = {Cost-Driven Data Replication with Predictions},
  author = {Tianyu Zuo and Xueyan Tang and Bu Sung Lee},
  journal= {arXiv preprint arXiv:2404.16489},
  year   = {2024}
}

Comments

The formal version of this draft will appear at the ACM SPAA'24 conference.