Related papers: Byzantine-Resilient Gradient Coding through Local …

Trading Communication for Computation in Byzantine-Resilient Gradient Coding

We consider gradient coding in the presence of an adversary, controlling so-called malicious workers trying to corrupt the computations. Previous works propose the use of MDS codes to treat the inputs of the malicious workers as errors and…

Information Theory · Computer Science 2023-06-06 Christoph Hofmeister, Luis Maßny, Eitan Yaakobi, Rawad Bitar

Interactive Byzantine-Resilient Gradient Coding for General Data Assignments

We tackle the problem of Byzantine errors in distributed gradient descent within the Byzantine-resilient gradient coding framework. Our proposed solution can recover the exact full gradient in the presence of $s$ malicious workers with a…

Information Theory · Computer Science 2024-01-31 Shreyas Jain, Luis Maßny, Christoph Hofmeister, Eitan Yaakobi, Rawad Bitar

Data Encoding for Byzantine-Resilient Distributed Optimization

We study distributed optimization in the presence of Byzantine adversaries, where both data and computation are distributed among $m$ worker machines, $t$ of which may be corrupt. The compromised nodes may collaboratively and arbitrarily…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-05 Deepesh Data, Linqi Song, Suhas Diggavi

Detection and Mitigation of Byzantine Attacks in Distributed Training

A plethora of modern machine learning tasks require the utilization of large-scale distributed clusters as a critical component of the training pipeline. However, abnormal Byzantine behavior of the worker nodes can derail the training and…

Machine Learning · Computer Science 2023-05-16 Konstantinos Konstantinidis, Namrata Vaswani, Aditya Ramamoorthy

Nested Gradient Codes for Straggler Mitigation in Distributed Machine Learning

We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such…

Information Theory · Computer Science 2022-12-19 Luis Maßny, Christoph Hofmeister, Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh

Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, suffer from slow running machines, called 'stragglers'. Gradient coding is a coding-theoretic framework to…

Information Theory · Computer Science 2019-05-01 Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

Gradient Coding from Cyclic MDS Codes and Expander Graphs

Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing…

Information Theory · Computer Science 2019-07-09 Netanel Raviv, Itzhak Tamo, Rashish Tandon, Alexandros G. Dimakis

Aspis: Robust Detection for Distributed Learning

State-of-the-art machine learning models are routinely trained on large-scale distributed clusters. Crucially, such systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior and return arbitrary…

Machine Learning · Computer Science 2022-01-25 Konstantinos Konstantinidis, Aditya Ramamoorthy

Byzantine-Resilient Distributed Computation via Task Replication and Local Computations

We study a distributed computation problem in the presence of Byzantine workers where a central node wishes to solve a task that is divided into independent sub-tasks, each of which needs to be solved correctly. The distributed computation…

Information Theory · Computer Science 2025-07-23 Aayush Rajesh, Nikhil Karamchandani, Vinod M. Prabhakaran

Byzantine Fault Tolerance of Regenerating Codes

Recent years have witnessed a slew of coding techniques custom designed for networked storage systems. Network coding inspired regenerating codes are the most prolifically studied among these new age storage centric codes. A lot of effort…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-06-14 Frédérique Oggier, Anwitaman Datta

Resilient Two-Time-Scale Local Stochastic Gradient Descent for Byzantine Federated Learning

We study local stochastic gradient descent methods for solving federated optimization over a network of agents communicating indirectly through a centralized coordinator. We are interested in the Byzantine setting where there is a subset of…

Optimization and Control · Mathematics 2024-09-06 Amit Dutta, Thinh T. Doan

Randomized Reactive Redundancy for Byzantine Fault-Tolerance in Parallelized Learning

This report considers the problem of Byzantine fault-tolerance in synchronous parallelized learning that is founded on the parallelized stochastic gradient descent (parallelized-SGD) algorithm. The system comprises a master, and $n$…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-12-23 Nirupam Gupta, Nitin H. Vaidya

Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data

We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any…

Machine Learning · Statistics 2020-05-19 Deepesh Data, Suhas Diggavi

Byzantine-Robust Variance-Reduced Federated Learning over Distributed Non-i.i.d. Data

We consider the federated learning problem where data on workers are not independent and identically distributed (i.i.d.). During the learning process, an unknown number of Byzantine workers may send malicious messages to the central node,…

Machine Learning · Computer Science 2021-08-31 Jie Peng, Zhaoxian Wu, Qing Ling, Tianyi Chen

Efficient Byzantine-Resilient Stochastic Gradient Descent

Distributed Learning often suffers from Byzantine failures, and there have been a number of works studying the problem of distributed stochastic optimization under Byzantine failures, where only a portion of workers, instead of all the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-08-17 Kaiyun Li, Xiaojun Chen, Ye Dong, Peng Zhang, Dakui Wang, Shuai Zen

Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering

Gradient-based training in federated learning is known to be vulnerable to faulty/malicious clients, which are often modeled as Byzantine clients. To this end, previous work either makes use of auxiliary data at parameter server to verify…

Machine Learning · Computer Science 2023-05-02 Jian Xu, Shao-Lun Huang, Linqi Song, Tian Lan

Counteracting Byzantine Adversaries with Network Coding: An Overhead Analysis

Network coding increases throughput and is robust against failures and erasures. However, since it allows mixing of information within the network, a single corrupted packet generated by a Byzantine attacker can easily contaminate the…

Information Theory · Computer Science 2009-03-27 MinJi Kim, Muriel Medard, Joao Barros

Election Coding for Distributed Learning: Protecting SignSGD against Byzantine Attacks

Recent advances in large-scale distributed learning algorithms have enabled communication-efficient training via SignSGD. Unfortunately, a major issue continues to plague distributed learning: namely, Byzantine failures may incur serious…

Information Theory · Computer Science 2020-10-27 Jy-yong Sohn, Dong-Jun Han, Beongjun Choi, Jaekyun Moon

Byzantine Stochastic Gradient Descent

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and can behave…

Machine Learning · Computer Science 2018-03-26 Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li

On Gradient Coding with Partial Recovery

We consider a generalization of the gradient coding framework where a dataset is divided across $n$ workers and each worker transmits to a master node one or more linear combinations of the gradients over its assigned data subsets. Unlike…

Information Theory · Computer Science 2022-05-03 Sahasrajit Sarmasarkar, V. Lalitha, Nikhil Karamchandani