Related papers: Self-Stabilizing Snapshot Objects for Asynchronous…
In this paper, we tackle the open problem of snap-stabilization in message-passing systems. Snap-stabilization is a nice approach to design protocols that withstand transient faults. Compared to the well-known self-stabilizing approach,…
A distributed system consisting of a huge number of computational entities is prone to faults, because faults in a few nodes cause the entire system to fail. Consequently, fault tolerance of distributed systems is a critical issue.…
We study the lattice agreement (LA) and atomic snapshot problems in asynchronous message-passing systems where up to $f$ nodes may crash. Our main result is a crash-tolerant atomic snapshot algorithm with \textit{amortized constant round…
The problem of total-order (uniform reliable) broadcast is fundamental in fault-tolerant distributed computing since it abstracts a broad set of problems requiring processes to uniformly deliver messages in the same order in which they were…
Self-stabilizing protocols enable distributed systems to recover correct behavior starting from any arbitrary configuration. In particular, when processors communicate by message passing, fake messages may be placed in communication links…
We study the problem of privately emulating shared memory in message-passing networks. The system includes clients that store and retrieve replicated information on N servers, out of which e are malicious. When a client accesses a malicious…
Guerraoui proposed an indulgent solution for the binary consensus problem. Namely, he showed that an arbitrary behavior of the failure detector never violates safety requirements even if it compromises liveness. Consensus implementations…
A snap-stabilizing protocol, starting from any configuration, always behaves according to its specification. In this paper, we present a snap-stabilizing protocol to solve the message forwarding problem in a message-switched network. In…
We study a well-known communication abstraction called Uniform Reliable Broadcast (URB). URB is central in the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications…
Self-stabilization is a versatile fault-tolerance approach that characterizes the ability of a system to eventually resume a correct behavior after any finite number of transient faults. In this paper, we propose a self-stabilizing reset…
We consider snap-stabilizing algorithms in anonymous networks. Self-stabilizing algorithms are well-known fault-tolerant algorithms: a self-stabilizing algorithm will eventually recover from arbitrary transient faults. On the other hand,…
An immediate snapshot object is a high-level communication object, built on top of a read/write distributed system in which all but one process may crash. It allows a process to write a value and obtain a set of values that represent a…
This paper introduces a novel, fast atomic-snapshot protocol for asynchronous message-passing systems. In the process of defining what ``fast'' means exactly, we spot a few interesting issues that arise when conventional time metrics are…
In this paper, we present the first snap-stabilizing message forwarding protocol that uses a number of buffers per node being independent of any global parameter, that is 4 buffers per link. The protocol works on a linear chain of nodes,…
The problem of multivalued consensus is fundamental in the area of fault-tolerant distributed computing since it abstracts a very broad set of agreement problems in which processes have to uniformly decide on a specific value v in V, where…
Shared memory emulation can be used as a fault-tolerant and highly available distributed storage solution or as a low-level synchronization primitive. Attiya, Bar-Noy, and Dolev were the first to propose a single-writer, multi-reader…
In the context of large-scale networks, the consideration of faults is an evident necessity. This document focuses on the self-stabilizing approach, which aims at conceiving algorithms "repairing themselves" in case of transient faults,…
We develop deterministic algorithms for the problems of consensus, gossiping and checkpointing with nodes prone to failing. Distributed systems are modeled as synchronous complete networks. Failures are represented either as crashes or…
Self-stabilization is a versatile methodology in the design of fault-tolerant distributed algorithms for transient faults. A self-stabilizing system automatically recovers from any kind and any finite number of transient faults. This…
We present a distributed self-adjusting algorithm for skip graphs that minimizes the average routing costs between arbitrary communication pairs by performing topological adaptation to the communication pattern. Our algorithm is fully…