Related papers: Invalidation-Based Protocols for Replicated Datast…

Reliable Replication Protocols on SmartNICs

Today's datacenter applications rely on datastores that are required to provide high availability, consistency, and performance. To achieve high availability, these datastores replicate data across several nodes. Such replication is managed…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-25 M. R. Siavash Katebzadeh, Antonios Katsarakis, Boris Grot

MDCC: Multi-Data Center Consistency

Replicating data across multiple data centers not only allows moving the data closer to the user and, thus, reduces latency for applications, but also increases the availability in the event of a data center failure. Therefore, it is not…

Databases · Computer Science 2012-03-28 Tim Kraska, Gene Pang, Michael J. Franklin, Samuel Madden

ReStore: In-Memory REplicated STORagE for Rapid Recovery in Fault-Tolerant Algorithms

Fault-tolerant distributed applications require mechanisms to recover data lost via a process failure. On modern cluster systems it is typically impractical to request replacement resources after such a failure. Therefore, applications have…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-26 Lukas Hübner, Demian Hespe, Peter Sanders, Alexandros Stamatakis

Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol

Today's datacenter applications are underpinned by datastores that are responsible for providing availability, consistency, and performance. For high availability in the presence of failures, these datastores replicate data across several…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-28 A. Katsarakis, V. Gavrielatos, M. Katebzadeh, A. Joshi, A. Dragojevic, B. Grot, V. Nagarajan

DXRAM's Fault-Tolerance Mechanisms Meet High Speed I/O Devices

In-memory key-value stores provide consistent low-latency access to all objects which is important for interactive large-scale applications like social media networks or online graph analytics and also opens up new application areas. But,…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-17 Kevin Beineke, Stefan Nothaas, Michael Schoettner

Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores

Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. In order to achieve highly-available yet high-performing services, cloud data stores…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-08-10 Álvaro García-Recuero, Sérgio Esteves, Luís Veiga

Fault-Tolerant Partial Replication in Large-Scale Database Systems

We investigate a decentralised approach to committing transactions in a replicated database, under partial replication. Previous protocols either re-execute transactions entirely and/or compute a total order of transactions. In contrast,…

Databases · Computer Science 2009-09-29 Pierre Sutra, Marc Shapiro

Reliable Data Storage in Distributed Hash Tables

Distributed Hash Tables offer a resilient lookup service for unstable distributed environments. Resilient data storage, however, requires additional data replication and maintenance algorithms. These algorithms can have an impact on both…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Matthew Leslie

A distributed Integrity Catalog for digital repositories

Digital repositories, either digital preservation systems or archival systems, periodically check the integrity of stored objects to assure users of their correctness. To do so, prior solutions calculate integrity metadata and require the…

Databases · Computer Science 2014-09-26 Nikos Chondros, Mema Roussopoulos

Data Placement and Replica Selection for Improving Co-location in Distributed Environments

Increasing need for large-scale data analytics in a number of application domains has led to a dramatic rise in the number of distributed data management systems, both parallel relational databases, and systems that support alternative…

Databases · Computer Science 2013-02-19 K. Ashwin Kumar, Amol Deshpande, Samir Khuller

Optimistic Parallel State-Machine Replication

State-machine replication, a fundamental approach to fault tolerance, requires replicas to execute commands deterministically, which usually results in sequential execution of commands. Sequential execution limits performance and underuses…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-04-29 Parisa Jalili Marandi, Fernando Pedone

DynoStore: A wide-area distribution system for the management of data over heterogeneous storage

Data distribution across different facilities offers benefits such as enhanced resource utilization, increased resilience through replication, and improved performance by processing data near its source. However, managing such data is…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-02 Dante D. Sanchez-Gallegos, J. L. Gonzalez-Compean, Maxime Gonthier, Valerie Hayot-Sasson, J. Gregory Pauloski, Haochen Pan, Kyle Chard, Jesus Carretero, Ian Foster

Management of Data Replication for PC Cluster-based Cloud Storage System

Storage systems are essential building blocks for cloud computing infrastructures. Although high-performance storage servers are the ultimate solution for cloud storage, the implementation of an inexpensive storage system remains an open…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-12-30 Julia Myint, Thinn Thu Naing

UniStore: A fault-tolerant marriage of causal and strong consistency (extended version)

Modern online services rely on data stores that replicate their data across geographically distributed data centers. Providing strong consistency in such data stores results in high latencies and makes the system vulnerable to network…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-30 Manuel Bravo, Alexey Gotsman, Borja de Régil, Hengfeng Wei

FASTEN: Towards a FAult-tolerant and STorage EfficieNt Cloud: Balancing Between Replication and Deduplication

With the surge in cloud storage adoption, enterprises face challenges managing data duplication and exponential data growth. Deduplication mitigates redundancy, yet maintaining redundancy ensures high availability, incurring storage costs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-10 Sabbir Ahmed, Md Nahiduzzaman, Tariqul Islam, Faisal Haque Bappy, Tarannum Shaila Zaman, Raiful Hasan

STAR: Scaling Transactions through Asymmetric Replication

In this paper, we present STAR, a new distributed in-memory database with asymmetric replication. By employing a single-node non-partitioned architecture for some replicas and a partitioned architecture for other replicas, STAR is able to…

Databases · Computer Science 2019-07-23 Yi Lu, Xiangyao Yu, Samuel Madden

Performance modeling of a distributed file-system

Data centers have become the center of big data processing. Most programs running in a data center process big data. The storage requirements of such programs cannot be fulfilled by a single node in the data center, and hence a distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-28 Sandeep Kumar

HyCoR: Fault-Tolerant Replicated Containers Based on Checkpoint and Replay

HyCoR is a fully-operational fault tolerance mechanism for multiprocessor workloads, based on container replication, using a hybrid of checkpointing and replay. HyCoR derives from two insights regarding replication mechanisms: 1)…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-26 Diyu Zhou, Yuval Tamir

Non-uniform Replication

Replication is a key technique in the design of efficient and reliable distributed systems. As information grows, it becomes difficult or even impossible to store all information at every replica. A common approach to deal with this problem…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-05 Gonçalo Cabrita, Nuno Preguiça

The Case for Replication-Aware Memory-Error Protection in Disaggregated Memory

Disaggregated memory leverages recent technology advances in high-density, byte-addressable non-volatile memory and high-performance interconnects to provide a large memory pool shared across multiple compute nodes. Due to higher memory…

Hardware Architecture · Computer Science 2024-06-10 Haris Volos