Related papers: Building State Machine Replication Using Practical…
State-machine replication, a fundamental approach to fault tolerance, requires replicas to execute commands deterministically, which usually results in sequential execution of commands. Sequential execution limits performance and underuses…
State machine replication is standard approach to fault tolerance. One of the key assumptions of state machine replication is that replicas must execute operations deterministically and thus serially. To benefit from multi-core servers,…
Developing state-machine replication protocols for practical use is a complex and labor-intensive process because of the myriad of essential tasks (e.g., deployment, communication, recovery) that need to be taken into account in an…
Synchronous computation models simplify the design and the verification of fault-tolerant distributed systems. For efficiency reasons such systems are designed and implemented using an asynchronous semantics. In this paper, we bridge the…
Data replication is crucial in modern distributed systems as a means to provide high availability. Many techniques have been proposed to utilize replicas to improve a system's performance, often requiring expensive coordination or…
With the slowdown of Moore's law, CPU-oriented packet processing in software will be significantly outpaced by emerging line speeds of network interface cards (NICs). Single-core packet-processing throughput has saturated. We consider the…
To ensure high availability in large scale distributed systems, Conflict-free Replicated Data Types (CRDTs) relax consistency by allowing immediate query and update operations at the local replica, with no need for remote synchronization.…
Implementations of state-machine replication (SMR) prevalently use the variants of Paxos. Some of the recent variants of Paxos like, Ring Paxos, Multi-Ring Paxos, S-Paxos and HT-Paxos achieve significantly high throughput. However, to meet…
With the growing performance requirements on networked applications, there is a new trend of offloading stateful network applications to SmartNICs to improve performance and reduce the total cost of ownership. However, offloading stateful…
Paxos is a prominent theory of state machine replication. Recent data intensive Systems those implement state machine replication generally require high throughput. Earlier versions of Paxos as few of them are classical Paxos, fast Paxos…
Modern Internet services commonly replicate critical data across several geographical locations using state-machine replication (SMR). Due to their reliance on a leader replica, classical SMR protocols offer limited scalability and…
Virtual synchrony is an important abstraction that is proven to be extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault tolerant design is a key criterion for the success of such…
State-machine replication, a fundamental approach to designing fault-tolerant services, requires commands to be executed in the same order by all replicas. Moreover, command execution must be deterministic: each replica must produce the…
We study general techniques for implementing distributed data structures on top of future many-core architectures with non cache-coherent or partially cache-coherent memory. With the goal of contributing towards what might become, in the…
In SDN stateful data planes, switches can execute algorithms to process traffic based on local states. This approach permits to offload decisions from the controller to the switches, thus to reduce the latency to react to network events. We…
Grid Computing is a type of parallel and distributed systems that is designed to provide reliable access to data and computational resources in wide area networks. These resources are distributed in different geographical locations, however…
This paper considers the classical state machine replication (SMR) problem in a distributed system model inspired by cross-chain exchanges. We propose a novel SMR protocol adapted for this model. Each state machine transition takes $O(n)$…
A common paradigm for scientific computing is distributed message-passing systems, and a common approach to these systems is to implement them across clusters of high-performance workstations. As multi-core architectures become increasingly…
Distributed algorithms and theories are called for in this era of big data. Under weaker local signal-to-noise ratios, we improve upon the celebrated one-round distributed principal component analysis (PCA) algorithm designed in the spirit…
In this paper, we consider the convergence of a very general asynchronous-parallel algorithm called ARock, that takes many well-known asynchronous algorithms as special cases (gradient descent, proximal gradient, Douglas Rachford, ADMM,…