Related papers: Practically-Self-Stabilizing Virtual Synchrony
Current reconfiguration techniques are based on starting the system in a consistent configuration, in which all participating entities are in their initial state. Starting from that state, the system must preserve consistency as long as a…
We present the first self-stabilizing consensus and replicated state machine for asynchronous message passing systems. The scheme does not require that all participants make a certain number of steps prior to reaching a practically infinite…
Self-stabilization is a versatile fault-tolerance approach that characterizes the ability of a system to eventually resume a correct behavior after any finite number of transient faults. In this paper, we propose a self-stabilizing reset…
Feedback control algorithms traditionally rely on periodic execution on digital platforms. While this simplifies design and analysis, it often leads to inefficient resource usage (e.g., CPU, network bandwidth) in embedded control and shared…
Fault tolerance is increasingly important for unmanned autonomous vehicles. For example, in a multi robot system the agents need the ability to effectively detect and tolerate internal failures in order to continue performing their tasks…
Consider a complete communication network of $n$ nodes, where the nodes receive a common clock pulse. We study the synchronous $c$-counting problem: given any starting state and up to $f$ faulty nodes with arbitrary behaviour, the task is…
Vector clock algorithms are basic wait-free building blocks that facilitate causal ordering of events. As wait-free algorithms, they are guaranteed to complete their operations within a finite number of steps. Stabilizing algorithms allow…
Today's hardware technology presents a new challenge in designing robust systems. Deep submicron VLSI technology introduced transient and permanent faults that were never considered in low-level system designs in the past. Still, robustness…
Self-stabilization is a versatile methodology in the design of fault-tolerant distributed algorithms for transient faults. A self-stabilizing system automatically recovers from any kind and any finite number of transient faults. This…
Power systems, including synchronous generator systems, are typical systems that strive for stable operation. In this article, we numerically study the fault transient process of a synchronous generator system based on the first benchmark…
Synchronous Counting is the task of reaching agreement on a common round counter in a synchronous system of $n$ nodes with up to $t$ Byzantine faults in a self-stabilizing manner. That is, after transient faults may have arbitrarily…
State machine replication is standard approach to fault tolerance. One of the key assumptions of state machine replication is that replicas must execute operations deterministically and thus serially. To benefit from multi-core servers,…
Numerous distributed applications, such as cloud computing and distributed ledgers, necessitate the system to invoke asynchronous consensus objects an unbounded number of times, where the completion of one consensus instance is followed by…
Distributed fault-tolerance can mask the effect of a limited number of permanent faults, while self-stabilization provides forward recovery after an arbitrary number of transient fault hit the system. FTSS protocols combine the best of both…
The problem of total-order (uniform reliable) broadcast is fundamental in fault-tolerant distributed computing since it abstracts a broad set of problems requiring processes to uniformly deliver messages in the same order in which they were…
Consider a complete communication network of $n$ nodes, where the nodes receive a common clock pulse. We study the synchronous $c$-counting problem: given any starting state and up to $f$ faulty nodes with arbitrary behaviour, the task is…
In this paper, we introduce an SMT-based method that automatically synthesizes a distributed self-stabilizing protocol from a given high-level specification and network topology. Unlike existing approaches, where synthesis algorithms…
The problem of multivalued consensus is fundamental in the area of fault-tolerant distributed computing since it abstracts a very broad set of agreement problems in which processes have to uniformly decide on a specific value v in V, where…
We give fault-tolerant algorithms for establishing synchrony in distributed systems in which each of the $n$ nodes has its own clock. Our algorithms operate in a very strong fault model: we require self-stabilisation, i.e., the initial…
Interactive consistency is the problem in which n nodes, where up to t may be byzantine, each with its own private value, run an algorithm that allows all non-faulty nodes to infer the values of each other node. This problem is relevant to…