Related papers: Temporal Predictive Coding for Gradient Compressio…
Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…
Communication overhead severely hinders the scalability of distributed machine learning systems. Recently, there has been a growing interest in using gradient compression to reduce the communication overhead of the distributed training.…
Due to limited communication resources at the client and a massive number of model parameters, large-scale distributed learning tasks suffer from communication bottleneck. Gradient compression is an effective method to reduce communication…
We consider large scale distributed optimization over a set of edge devices connected to a central server, where the limited communication bandwidth between the server and edge devices imposes a significant bottleneck for the optimization…
Communication overhead is the key challenge for distributed training. Gradient compression is a widely used approach to reduce communication traffic. When combining with parallel communication mechanism method like pipeline, gradient…
This paper considers the multi-agent linear least-squares problem in a server-agent network. In this problem, the system comprises multiple agents, each having a set of local data points, that are connected to a server. The goal for the…
Coded distributed computation has become common practice for performing gradient descent on large datasets to mitigate stragglers and other faults. This paper proposes a novel algorithm that encodes the partial derivatives themselves and…
In recent years, distributed optimization is proven to be an effective approach to accelerate training of large scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of…
Training deep neural networks on large datasets containing high-dimensional data requires a large amount of computation. A solution to this problem is data-parallel distributed training, where a model is replicated into several…
We consider distributed online learning for joint regret with communication constraints. In this setting, there are multiple agents that are connected in a graph. Each round, an adversary first activates one of the agents to issue a…
This paper proposes a deep learning framework to design distributed compression strategies in which distributed agents need to compress high-dimensional observations of a source, then send the compressed bits via bandwidth limited links to…
This paper studies a distributed multi-agent convex optimization problem. The system comprises multiple agents in this problem, each with a set of local data points and an associated local cost function. The agents are connected to a…
The goal of this thesis is to study the compression problems arising in distributed computing systematically. In the first part of the thesis, we study gradient compression for distributed first-order optimization. We begin by establishing…
We consider machine learning applications that train a model by leveraging data distributed over a trusted network, where communication constraints can create a performance bottleneck. A number of recent approaches propose to overcome this…
Asynchronous computation and gradient compression have emerged as two key techniques for achieving scalability in distributed optimization for large-scale machine learning. This paper presents a unified analysis framework for distributed…
An increasing bottleneck in decentralized optimization is communication. Bigger models and growing datasets mean that decentralization of computation is important and that the amount of information exchanged is quickly growing. While…
This paper considers a distributed stochastic strongly convex optimization, where agents connected over a network aim to cooperatively minimize the average of all agents' local cost functions. Due to the stochasticity of gradient estimation…
Communication compression techniques are of growing interests for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multi-agent…
Communicating information, like gradient vectors, between computing nodes in distributed and federated learning is typically an unavoidable burden, resulting in scalability issues. Indeed, communication might be slow and costly. Recent…
In this paper we consider online distributed learning problems. Online distributed learning refers to the process of training learning models on distributed data sources. In our setting a set of agents need to cooperatively train a learning…