Reconfigurable Atomic Transaction Commit (Extended Version)

Distributed, Parallel, and Cluster Computing 2019-06-05 v1

Abstract

Modern data stores achieve scalability by partitioning data into shards and fault-tolerance by replicating each shard across several servers. A key component of such systems is a Transaction Certification Service (TCS), which atomically commits a transaction spanning multiple shards. Existing TCS protocols require 2f+1 crash-stop replicas per shard to tolerate f failures. In this paper, we present atomic commit protocols that require only f+1 replicas and reconfigure the system upon failures using an external reconfiguration service. We furthermore rigorously prove that these protocols correctly implement a recently proposed TCS specification. We present protocols in two different models: the standard asynchronous message-passing model and a model with Remote Direct Memory Access (RDMA), which allows a machine to access the memory of another machine over the network without involving the latter's CPU. Our protocols are inspired by FARM, a recent system for RDMA-based transaction processing. Our work codifies the core ideas of FARM as distributed TCS protocols, rigorously proves them correct, and highlights the trade-offs required by the use of RDMA.
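To make the replica-count claim concrete, the following minimal sketch compares the per-shard and total replica requirements stated above (2f+1 for existing TCS protocols versus f+1 with reconfiguration). The function names and the example shard count are hypothetical and chosen only for illustration; the sketch does not model the protocols themselves or the external reconfiguration service.

```python
def replicas_per_shard(f: int, reconfigurable: bool) -> int:
    """Replicas needed per shard to tolerate f crash-stop failures.

    Assumption (from the abstract): existing TCS protocols need 2f+1
    replicas, while the reconfigurable protocols need only f+1.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1 if reconfigurable else 2 * f + 1


def total_replicas(shards: int, f: int, reconfigurable: bool) -> int:
    """Total replicas across all shards (hypothetical helper)."""
    if shards < 0:
        raise ValueError("shards must be non-negative")
    return shards * replicas_per_shard(f, reconfigurable)


if __name__ == "__main__":
    shards, f = 10, 2  # illustrative values, not taken from the paper
    print("existing protocols:", total_replicas(shards, f, reconfigurable=False))
    print("reconfigurable:    ", total_replicas(shards, f, reconfigurable=True))
```

The saving grows with f: each shard needs f fewer replicas, at the cost of relying on an external reconfiguration service when a replica fails.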

Cite

@article{arxiv.1906.01365,
  title   = {Reconfigurable Atomic Transaction Commit (Extended Version)},
  author  = {Manuel Bravo and Alexey Gotsman},
  journal = {arXiv preprint arXiv:1906.01365},
  year    = {2019}
}

Comments

Extended version of the PODC '19 paper: Reconfigurable Atomic Transaction Commit