Related papers: Reliable Replication Protocols on SmartNICs

Invalidation-Based Protocols for Replicated Datastores

Distributed in-memory datastores underpin cloud applications that run within a datacenter and demand high performance, strong consistency, and availability. A key feature of datastores is data replication. The data are replicated across…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-07 Antonios Katsarakis

Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol

Today's datacenter applications are underpinned by datastores that are responsible for providing availability, consistency, and performance. For high availability in the presence of failures, these datastores replicate data across several…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-28 A. Katsarakis, V. Gavrielatos, M. Katebzadeh, A. Joshi, A. Dragojevic, B. Grot, V. Nagarajan

Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore

Spinnaker is an experimental datastore that is designed to run on a large cluster of commodity servers in a single datacenter. It features key-based range partitioning, 3-way replication, and a transactional get-put API with the option to…

Databases · Computer Science 2011-03-15 Jun Rao, Eugene J. Shekita, Sandeep Tata

Reliable Data Storage in Distributed Hash Tables

Distributed Hash Tables offer a resilient lookup service for unstable distributed environments. Resilient data storage, however, requires additional data replication and maintenance algorithms. These algorithms can have an impact on both…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Matthew Leslie

MDCC: Multi-Data Center Consistency

Replicating data across multiple data centers not only allows moving the data closer to the user and, thus, reduces latency for applications, but also increases the availability in the event of a data center failure. Therefore, it is not…

Databases · Computer Science 2012-03-28 Tim Kraska, Gene Pang, Michael J. Franklin, Samuel Madden

Building Blocks for Network-Accelerated Distributed File Systems

High-performance clusters and datacenters pose increasingly demanding requirements on storage systems. If these systems do not operate at scale, applications are doomed to become I/O bound and waste compute cycles. To accelerate the data…

Networking and Internet Architecture · Computer Science 2022-06-22 Salvatore Di Girolamo, Daniele De Sensi, Konstantin Taranov, Milos Malesevic, Maciej Besta, Timo Schneider, Severin Kistler, Torsten Hoefler

Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores

Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. In order to achieve highly-available yet high-performing services, cloud data stores…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-08-10 Álvaro García-Recuero, Sérgio Esteves, Luís Veiga

Scaling Strongly Consistent Replication

Strong consistency replication helps keep application logic simple and provides significant benefits for correctness and manageability. Unfortunately, the adoption of strongly-consistent replication protocols has been curbed due to their…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-22 Aleksey Charapko, Ailidani Ailijiang, Murat Demirbas

RCC: Resilient Concurrent Consensus for High-Throughput Secure Transaction Processing

Recently, we saw the emergence of consensus-based database systems that promise resilience against failures, strong data provenance, and federated data management. Typically, these fully-replicated systems are operated on top of a…

Databases · Computer Science 2020-11-04 Suyash Gupta, Jelle Hellings, Mohammad Sadoghi

ReStore: In-Memory REplicated STORagE for Rapid Recovery in Fault-Tolerant Algorithms

Fault-tolerant distributed applications require mechanisms to recover data lost via a process failure. On modern cluster systems it is typically impractical to request replacement resources after such a failure. Therefore, applications have…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-26 Lukas Hübner, Demian Hespe, Peter Sanders, Alexandros Stamatakis

RIFL: A Reliable Link Layer Network Protocol for Data Center Communication

More and more latency-sensitive services and applications are being deployed into the data center. Performance can be limited by the high latency of the network interconnect. Because the conventional network stack is designed not only for…

Networking and Internet Architecture · Computer Science 2023-09-19 Qianfeng Shen, Jun Zheng, Paul Chow

A Primer on RecoNIC: RDMA-enabled Compute Offloading on SmartNIC

Today's data centers consist of thousands of network-connected hosts, each with CPUs and accelerators such as GPUs and FPGAs. These hosts also contain network interface cards (NICs), operating at speeds of 100Gb/s or higher, that are used…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-12 Guanwen Zhong, Aditya Kolekar, Burin Amornpaisannon, Inho Choi, Haris Javaid, Mario Baldi

The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis

Datastores today rely on distribution and replication to achieve improved performance and fault-tolerance. But correctness of many applications depends on strong consistency properties - something that can impose substantial overheads,…

Databases · Computer Science 2015-01-21 Sudip Roy, Lucja Kot, Gabriel Bender, Bailu Ding, Hossein Hojjat, Christoph Koch, Nate Foster, Johannes Gehrke

Advancements in Traffic Processing Using Programmable Hardware Flow Offload

The exponential growth of data traffic and the increasing complexity of networked applications demand effective solutions capable of passively inspecting and analysing the network traffic for monitoring and security purposes. Implementing…

Networking and Internet Architecture · Computer Science 2024-07-24 Luca Deri, Alfredo Cardigliano, Francesco Fusco

Pattern-based Modeling of High-Performance Computing Resilience

With the growing scale and complexity of high-performance computing (HPC) systems, resilience solutions that ensure continuity of service despite frequent errors and component failures must be methodically designed to balance the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-10 Saurabh Hukerikar, Christian Engelmann

Stream Control Transmission Protocol (SCTP): Robust and Efficient for Data Centre Applications

With the rapid advancement of modern technology, one of the major concerns is the stability of business. Organizations depend on their systems to provide robust and fast processing of information for their operations. Efficient data…

Networking and Internet Architecture · Computer Science 2013-12-04 Fatma Almajadub, Abdul Razaque, Eman Abdel Fattah

Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications

The replication mechanism resolves some challenges with big data such as data durability, data access, and fault tolerance. Yet, replication itself gives birth to another challenge known as the consistency in distributed systems.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-12 Hesam Nejati Sharif Aldin, Hossein Deldari, Mohammad Hossein Moattar, Mostafa Razavi Ghods

Serial Parallel Reliability Redundancy Allocation Optimization for Energy Efficient and Fault Tolerant Cloud Computing

Serial-parallel redundancy is a reliable way to ensure service and systems will be available in cloud computing. That method involves making copies of the same system or program, with only one remaining active. When an error occurs, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-08 Gutha Jaya Krishna

Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems

The semantics of HPC storage systems are defined by the consistency models to which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the exception of the POSIX standard and its…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-03 Chen Wang, Kathryn Mohror, Marc Snir

QStack: Re-architecting User-space Network Stack to Optimize CPU Efficiency and Service Quality

The TCP/IP network stack is irreplaceable for Web services in datacenter front-end servers, and demand for it is growing rapidly for emerging high-concurrency network service applications (including Internet, Internet of Things, mobile…

Networking and Internet Architecture · Computer Science 2022-10-18 WL Zhang, YF Shen, H Song, Zh Zhang, K Liu, Q Huang, MY Chen