Related papers: Self-Stabilizing Snapshot Objects for Asynchronous…

Snap-Stabilization in Message-Passing Systems

In this paper, we tackle the open problem of snap-stabilization in message-passing systems. Snap-stabilization is a nice approach to design protocols that withstand transient faults. Compared to the well-known self-stabilizing approach,…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-09-29 Sylvie Delaët, Stéphane Devismes, Mikhail Nesterenko, Sébastien Tixeuil

A cooperative partial snapshot algorithm for checkpoint-rollback recovery of large-scale and dynamic distributed systems and experimental evaluations

A distributed system consisting of a huge number of computational entities is prone to faults, because faults in a few nodes cause the entire system to fail. Consequently, fault tolerance of distributed systems is a critical issue.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-30 Junya Nakamura, Yonghwan Kim, Yoshiaki Katayama, Toshimitsu Masuzawa

Amortized Constant Round Atomic Snapshot in Message-Passing Systems

We study the lattice agreement (LA) and atomic snapshot problems in asynchronous message-passing systems where up to $f$ nodes may crash. Our main result is a crash-tolerant atomic snapshot algorithm with amortized constant round…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-01 Vijay Garg, Saptaparni Kumar, Lewis Tseng, Xiong Zheng

Self-stabilizing Total-order Broadcast

The problem of total-order (uniform reliable) broadcast is fundamental in fault-tolerant distributed computing since it abstracts a broad set of problems requiring processes to uniformly deliver messages in the same order in which they were…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-30 Oskar Lundström, Michel Raynal, Elad Michael Schiller

Ressource Efficient Stabilization for Local Tasks despite Unknown Capacity Links

Self-stabilizing protocols enable distributed systems to recover correct behavior starting from any arbitrary configuration. In particular, when processors communicate by message passing, fake messages may be placed in communication links…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-14 Lélia Blin, Anaïs Durand, Sébastien Tixeuil

Self-Stabilizing and Private Distributed Shared Atomic Memory in Seldomly Fair Message Passing Networks

We study the problem of privately emulating shared memory in message-passing networks. The system includes clients that store and retrieve replicated information on N servers, out of which e are malicious. When a client accesses a malicious…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-12 Shlomi Dolev, Thomas Petig, Elad Michael Schiller

Self-Stabilizing Indulgent Zero-degrading Binary Consensus

Guerraoui proposed an indulgent solution for the binary consensus problem. Namely, he showed that an arbitrary behavior of the failure detector never violates safety requirements even if it compromises liveness. Consensus implementations…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-13 Oskar Lundström, Michel Raynal, Elad Michael Schiller

Two snap-stabilizing point-to-point communication protocols in message-switched networks

A snap-stabilizing protocol, starting from any configuration, always behaves according to its specification. In this paper, we present a snap-stabilizing protocol to solve the message forwarding problem in a message-switched network. In…

Data Structures and Algorithms · Computer Science 2009-05-18 Alain Cournier, Swan Dubois, Vincent Villain

Self-stabilizing Uniform Reliable Broadcast

We study a well-known communication abstraction called Uniform Reliable Broadcast (URB). URB is central in the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-13 Oskar Lundström, Michel Raynal, Elad M. Schiller

Self-Stabilizing Distributed Cooperative Reset

Self-stabilization is a versatile fault-tolerance approach that characterizes the ability of a system to eventually resume a correct behavior after any finite number of transient faults. In this paper, we propose a self-stabilizing reset…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-23 Stéphane Devismes, Colette Johnen

Snap-Stabilizing Tasks in Anonymous Networks

We consider snap-stabilizing algorithms in anonymous networks. Self-stabilizing algorithms are well-known fault-tolerant algorithms: a self-stabilizing algorithm will eventually recover from arbitrary transient faults. On the other hand,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-14 Emmanuel Godard

$t$-Resilient $k$-Immediate Snapshot and its Relation with Agreement Problems

An immediate snapshot object is a high level communication object, built on top of a read/write distributed system in which all except one processes may crash. It allows a process to write a value and obtain a set of values that represent a…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-02 Carole Delporte, Hugues Fauconnier, Sergio Rajsbaum, Michel Raynal

Asynchronous Latency and Fast Atomic Snapshot

This paper introduces a novel, fast atomic-snapshot protocol for asynchronous message-passing systems. In the process of defining what "fast" means exactly, we spot a few interesting issues that arise when conventional time metrics are…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-19 João Paulo Bezerra, Luciano Freitas, Petr Kuznetsov, Matthieu Rambaud

Snap-Stabilizing Linear Message Forwarding

In this paper, we present the first snap-stabilizing message forwarding protocol that uses a number of buffers per node being independent of any global parameter, that is 4 buffers per link. The protocol works on a linear chain of nodes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-06-18 Anissa Lamani, Alain Cournier, Swan Dubois, Franck Petit, Vincent Villain

Self-stabilizing Multivalued Consensus in Asynchronous Crash-prone Systems

The problem of multivalued consensus is fundamental in the area of fault-tolerant distributed computing since it abstracts a very broad set of agreement problems in which processes have to uniformly decide on a specific value v in V, where…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-08 Oskar Lundström, Michel Raynal, Elad Michael Schiller

Self-stabilization Overhead: an Experimental Case Study on Coded Atomic Storage

Shared memory emulation can be used as a fault-tolerant and highly available distributed storage solution or as a low-level synchronization primitive. Attiya, Bar-Noy, and Dolev were the first to propose a single-writer, multi-reader…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-27 Chryssis Georgiou, Robert Gustafsson, Andreas Lindhe, Elad M. Schiller

Algorithmes auto-stabilisants pour la construction d'arbres couvrants et la gestion d'entités autonomes

In the context of large-scale networks, the consideration of faults is an evident necessity. This document is focussing on the self-stabilizing approach which aims at conceiving algorithms "repairing themselves" in case of transient faults,…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-10 Lélia Blin

Deterministic Fault-Tolerant Distributed Computing in Linear Time and Communication

We develop deterministic algorithms for the problems of consensus, gossiping and checkpointing with nodes prone to failing. Distributed systems are modeled as synchronous complete networks. Failures are represented either as crashes or…

Data Structures and Algorithms · Computer Science 2023-05-22 Bogdan S. Chlebus, Dariusz R. Kowalski, Jan Olkowski

The R(1)W(1) Communication Model for Self-Stabilizing Distributed Algorithms

Self-stabilization is a versatile methodology in the design of fault-tolerant distributed algorithms for transient faults. A self-stabilizing system automatically recovers from any kind and any finite number of transient faults. This…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-07 Hirotsugu Kakugawa, Sayaka Kamei, Masahiro Shibata, Fukuhito Ooshita

Locally Self-Adjusting Skip Graphs

We present a distributed self-adjusting algorithm for skip graphs that minimizes the average routing costs between arbitrary communication pairs by performing topological adaptation to the communication pattern. Our algorithm is fully…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-05 Sikder Huq, Sukumar Ghosh