Related papers: Differentially Quantized Gradient Methods
Gradient quantization is an emerging technique for reducing communication costs in distributed learning. Existing gradient quantization algorithms often rely on engineering heuristics or empirical observations, lacking a systematic approach…
Large-scale distributed optimization is of great importance in various applications. For data-parallel distributed learning, the inter-node gradient communication often becomes the performance bottleneck. In this paper, we propose the…
The distributed subgradient method (DSG) is a widely discussed algorithm for large-scale distributed optimization problems arising in machine learning applications. Most existing works on DSG focus on ideal communication…
We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. Recent advances using the distributed gradient…
We consider the problem of solving a distributed optimization problem using a distributed computing platform, where the communication in the network is limited: each node can only communicate with its neighbours and the channel has a…
Stochastic gradient descent (SGD) is a widely adopted iterative method for optimizing differentiable objective functions. In this paper, we propose and discuss a novel approach to scale up SGD in applications involving non-convex functions…
Communication efficiency is a major bottleneck in the applications of distributed networks. To address it, quantized distributed optimization has attracted a lot of attention. However, most of the existing quantized…
We consider a decentralized learning problem, where a set of computing nodes aim at solving a non-convex optimization problem collaboratively. It is well-known that decentralized optimization schemes face two major system bottlenecks:…
In this paper, we consider the unconstrained distributed optimization problem, in which the exchange of information in the network is captured by a directed graph topology, so nodes can only communicate with their neighbors.…
We consider the problem of decentralized consensus optimization, where the sum of $n$ smooth and strongly convex functions is minimized over $n$ distributed agents that form a connected network. In particular, we consider the case that the…
In this paper, we establish new convergence results for quantized distributed gradient descent and suggest a novel strategy for choosing the stepsizes to improve the performance of the algorithm. Under the strong convexity assumption on…
Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks.…
In this paper, we study unconstrained, strongly convex distributed optimization problems, in which the exchange of information in the network is captured by a directed graph topology over digital channels that have limited capacity (and…
Training generative adversarial networks (GANs) in a distributed fashion is a promising technology, since it helps train GANs efficiently on massive amounts of data in real-world applications. However, GANs are known to be difficult to…
To address the communication bottleneck challenge in distributed learning, our work introduces a novel two-stage quantization strategy designed to enhance the communication efficiency of distributed Stochastic Gradient Descent (SGD). The…
Gradient compression has surfaced as a key technique to address the challenge of communication efficiency in distributed learning. In distributed deep learning, however, it is observed that gradient distributions are heavy-tailed, with…
This paper considers the problem of decentralized optimization on compact submanifolds, where a finite sum of smooth (possibly non-convex) local functions is minimized by $n$ agents forming an undirected and connected graph. However, the…
The stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice due to its…
Gradient Descent (GD) is a ubiquitous algorithm for finding the optimal solution to an optimization problem. For reduced computational complexity, the optimal solution $\mathrm{x^*}$ of the optimization problem must be attained in a minimum…
Variational quantum algorithms (VQAs) are promising methods that leverage noisy quantum computers and classical computing techniques for practical applications. In VQAs, the classical optimizers such as gradient-based optimizers are…