Related papers: Optimistic Parallel State-Machine Replication
State-machine replication, a fundamental approach to designing fault-tolerant services, requires commands to be executed in the same order by all replicas. Moreover, command execution must be deterministic: each replica must produce the…
State machine replication is standard approach to fault tolerance. One of the key assumptions of state machine replication is that replicas must execute operations deterministically and thus serially. To benefit from multi-core servers,…
Modern Internet services commonly replicate critical data across several geographical locations using state-machine replication (SMR). Due to their reliance on a leader replica, classical SMR protocols offer limited scalability and…
Linearizability is a well-known correctness property for concurrent and distributed systems. In the past, it was also used to prove the design and implementation of replicated state-machines correct. State-machine replication (SMR) is a…
Developing state-machine replication protocols for practical use is a complex and labor-intensive process because of the myriad of essential tasks (e.g., deployment, communication, recovery) that need to be taken into account in an…
With the slowdown of Moore's law, CPU-oriented packet processing in software will be significantly outpaced by emerging line speeds of network interface cards (NICs). Single-core packet-processing throughput has saturated. We consider the…
Distributed systems, such as state machine replication, are critical infrastructures for modern applications. Practical distributed protocols make minimum assumptions about the underlying network: They typically assume a partially…
This paper considers the classical state machine replication (SMR) problem in a distributed system model inspired by cross-chain exchanges. We propose a novel SMR protocol adapted for this model. Each state machine transition takes $O(n)$…
Serial-parallel redundancy is a reliable way to ensure service and systems will be available in cloud computing. That method involves making copies of the same system or program, with only one remaining active. When an error occurs, the…
Consensus, state-machine replication (SMR) and total order broadcast (TOB) protocols are notorious for being poorly scalable with the number of participating nodes. Despite the recent race to reduce overall message complexity of…
State Machine Replication (SMR) is a fundamental approach to designing service with fault tolerance. However, its requirement for the deterministic execution of transactions often results in single-threaded replicas, which cannot fully…
This paper proposes a parallelizable algorithm for linear-quadratic model predictive control (MPC) problems with state and input constraints. The algorithm itself is based on a parallel MPC scheme that has originally been designed for…
This paper focuses on reducing memory usage in enumerative model checking, while maintaining the multi-core scalability obtained in earlier work. We present a tree-based multi-core compression method, which works by leveraging sharing among…
State machine replication (SMR) is a replication technique that ensures fault tolerance by duplicating a service. Geo-replicated SMR is an enhanced version of SMR that distributes replicas in separate geographical locations, making the…
We propose new sequential sorting operations by adapting techniques and methods used for designing parallel sorting algorithms. Although the norm is to parallelize a sequential algorithm to improve performance, we adapt a contrarian…
We present a lightweight solution for state machine replication with commitment certificates. Specifically, we adapt and analyze a median rule for the stabilizing consensus problem [Doerr11] to operate in a client-server setting where…
Reconfigurable state machine replication is an important enabler of elasticity for replicated cloud services, which must be able to dynamically adjust their size as a function of changing load and resource availability. We introduce a new…
We study large-scale distributed cooperative systems that use optimistic replication. We represent a system as a graph of actions (operations) connected by edges that reify semantic constraints between actions. Constraint types include…
One typical use case of large-scale distributed computing in data centers is to decompose a computation job into many independent tasks and run them in parallel on different machines, sometimes known as the "embarrassingly parallel"…
Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…