Related papers: Securing Distributed Gradient Descent in High Dime…
We consider the problem of distributed statistical machine learning in adversarial settings, where some unknown and time-varying subset of working machines may be compromised and behave arbitrarily to prevent an accurate model from being…
In large-scale distributed learning, security issues have become increasingly important. Particularly in a decentralized environment, some computing units may behave abnormally, or even exhibit Byzantine failures -- arbitrary and…
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any…
We develop a communication-efficient distributed learning algorithm that is robust against Byzantine worker machines. We propose and analyze a distributed gradient-descent algorithm that performs a simple thresholding based on gradient…
Adversarial attacks pose a major challenge to distributed learning systems, prompting the development of numerous robust learning methods. However, most existing approaches suffer from the curse of dimensionality, i.e. the error increases…
Decentralized stochastic gradient descent (D-SGD) is an efficient method for large-scale distributed learning. Existing generalization studies mainly address expected results, achieving rates limited to $\mathcal{O}\left(\frac{1}{\delta…
The recent advances in sensor technologies and smart devices enable the collaborative collection of a sheer volume of data from multiple information sources. As a promising tool to efficiently extract useful information from such big data,…
State-of-the-art machine learning models are routinely trained on large-scale distributed clusters. Crucially, such systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior and return arbitrary…
A plethora of modern machine learning tasks require the utilization of large-scale distributed clusters as a critical component of the training pipeline. However, abnormal Byzantine behavior of the worker nodes can derail the training and…
We study local stochastic gradient descent methods for solving federated optimization over a network of agents communicating indirectly through a centralized coordinator. We are interested in the Byzantine setting where there is a subset of…
Distributed model training needs to be adapted to challenges such as the straggler effect and Byzantine attacks. When coordinating the training process with multiple computing nodes, ensuring timely and reliable gradient aggregation amidst…
Distributed Learning often suffers from Byzantine failures, and there have been a number of works studying the problem of distributed stochastic optimization under Byzantine failures, where only a portion of workers, instead of all the…
Distributed model training is vulnerable to byzantine system failures and adversarial compute nodes, i.e., nodes that use malicious updates to corrupt the global model stored at a parameter server (PS). To guarantee some form of robustness,…
Distributed learning has gained significant attention due to its advantages in scalability, privacy, and fault tolerance.In this paradigm, multiple agents collaboratively train a global model by exchanging parameters only with their…
Distributed learning has emerged as a leading paradigm for training large machine learning models. However, in real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy…
While machine learning is going through an era of celebrated success, concerns have been raised about the vulnerability of its backbone: stochastic gradient descent (SGD). Recent approaches have been proposed to ensure the robustness of…
We consider the federated learning problem where data on workers are not independent and identically distributed (i.i.d.). During the learning process, an unknown number of Byzantine workers may send malicious messages to the central node,…
We tackle the problem of Byzantine errors in distributed gradient descent within the Byzantine-resilient gradient coding framework. Our proposed solution can recover the exact full gradient in the presence of $s$ malicious workers with a…
Distributed machine learning algorithms enable learning of models from datasets that are distributed over a network without gathering the data at a centralized location. While efficient distributed algorithms have been developed under the…
We study robust distributed learning that involves minimizing a non-convex loss function with saddle points. We consider the Byzantine setting where some worker machines have abnormal or even arbitrary and adversarial behavior. In this…