Related papers: Proactive Service Migration for Long-Running Byzan…
Software-defined networking offers numerous benefits over legacy networking systems by simplifying network management and reducing the cost of network configuration. Currently, the management of failures in the…
Byzantine fault-tolerant systems have been researched for more than four decades, and although shown to be possible early on, the solutions were impractical for a long time. With PBFT, the first practical solution was proposed in 1999 and spawned new…
Replicated services are inherently vulnerable to failures and security breaches. In a long-running system, it is, therefore, indispensable to maintain a reconfiguration mechanism that would replace faulty replicas with correct ones. An…
An active approach to fault tolerance is essential for robot swarms to achieve long-term autonomy. Previous efforts have focused on responding to spontaneous electro-mechanical faults and failures. However, many faults occur gradually over…
Consensus is a fundamental building block for constructing reliable and fault-tolerant distributed services. Many Byzantine fault-tolerant consensus protocols designed for partially synchronous systems adopt a pessimistic approach when…
This paper considers the problem of Byzantine fault tolerance in distributed linear regression in a multi-agent system. However, the proposed algorithms are given for a more general class of distributed optimization problems, of which…
Due to the use of commodity software and hardware, crash-stop and Byzantine failures are likely to be more prevalent in today's large-scale distributed storage systems. Regenerating codes have been shown to be a more efficient way to…
In this paper we demonstrate how stateful Byzantine Fault Tolerant services may be hosted on a Chord ring. The strategy presented is fourfold: firstly a replication scheme that dissociates the maintenance of replicated service state from…
Byzantine fault-tolerant (BFT) systems are able to maintain the availability and integrity of IoT systems in the presence of failures of individual components, random data corruption, or malicious attacks. Fault-tolerant systems in general are…
We provide an epistemic logical language and semantics for the modeling and analysis of Byzantine fault-tolerant multi-agent systems. This not only facilitates reasoning about the agents' fault status but also supports model updates for…
To improve the overall efficiency and reliability of Byzantine protocols in large sparse networks, we propose a new system assumption for developing multi-scale fault-tolerant systems, with which several kinds of multi-scale Byzantine…
Recent years have witnessed a slew of coding techniques custom-designed for networked storage systems. Network-coding-inspired regenerating codes are the most prolifically studied among these new-age storage-centric codes. A lot of effort…
We address the challenges of Byzantine-robust training in asynchronous distributed machine learning systems, aiming to enhance efficiency amid massive parallelization and heterogeneous computing resources. Asynchronous systems, marked by…
Smart environment applications demand novel solutions for managing quality of service, especially availability and reliability at run-time. The underlying systems are changing dynamically due to the addition and removal of system components,…
Numerous distributed applications, such as cloud computing and distributed ledgers, require the system to invoke asynchronous consensus objects an unbounded number of times, where the completion of one consensus instance is followed by…
Data is critical for the operation of any organization and needs to be protected, especially against attacks that compromise the state of the database. In this paper, we explore an approach based on Byzantine fault-tolerant replicated state…
In this paper, we present a Byzantine fault-tolerant distributed commit protocol for transactions running over untrusted networks. The traditional two-phase commit protocol is enhanced by replicating the coordinator and by running a…
Modern distributed systems face growing security threats, as attackers continuously enhance their skills and vulnerabilities span across the entire system stack, from hardware to the application layer. In the system design phase, fault…
Recent Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols increasingly focus on scalability to meet the requirements of distributed ledger technology (DLT). Validating the performance of scalable BFT protocol…
In this paper, we investigate the challenging framework of Byzantine-robust training in distributed machine learning (ML) systems, focusing on enhancing both efficiency and practicality. As distributed ML systems become integral for complex…