Related papers: DADAO: Decoupled Accelerated Decentralized Asynchr…
We propose a distributed first-order augmented Lagrangian (DFAL) algorithm to minimize the sum of composite convex functions, where each term in the sum is a private cost function belonging to a node, and only nodes connected by an edge can…
Training large models with distributed data parallelism (DDP) requires frequent communication of gradients across workers, which can saturate bandwidth. Infrequent communication strategies (e.g., Local SGD) reduce this overhead but, when…
Distributed supply-chain optimization demands algorithms that can cope with unreliable communication, unbounded messaging delays, and geographically dispersed agents while still guaranteeing convergence with provable rates. In this work, we…
This paper proposes a novel family of primal-dual-based distributed algorithms for smooth, convex, multi-agent optimization over networks that uses only gradient information and gossip communications. The algorithms can also employ…
In this paper, we study the problem of minimizing a sum of smooth and strongly convex functions split over the nodes of a network in a decentralized fashion. We propose ESDACD, a decentralized accelerated algorithm that only…
We present a new class of decentralized first-order methods for nonsmooth and stochastic optimization problems defined over multiagent networks. Considering that communication is a major bottleneck in decentralized optimization, our main…
In this paper, we study the communication and (sub)gradient computation costs in distributed optimization and give a sharp complexity analysis for the proposed distributed accelerated gradient methods. We present two algorithms based on the…
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes. The loss functions of the agents are assumed to be similar, due to statistical data similarity or otherwise. In order…
In Part I of this work [1], we developed an accelerated algorithmic framework, DAMA (Decentralized Accelerated Minimax Approach), for nonconvex Polyak-Lojasiewicz (PL) minimax optimization over decentralized multi-agent networks. To further…
In this work and its accompanying Part II [1], we develop an accelerated algorithmic framework, DAMA (Decentralized Accelerated Minimax Approach), for nonconvex Polyak-Lojasiewicz minimax optimization over decentralized multi-agent…
Decentralized optimization, particularly the class of decentralized composite convex optimization (DCCO) problems, has found many applications. Due to ubiquitous communication congestion and random dropouts in practice, it is highly…
In this paper, we determine the optimal convergence rates for strongly convex and smooth distributed optimization in two settings: centralized and decentralized communications over a network. For centralized (i.e. master/slave) algorithms,…
In this work, we introduce an asynchronous decentralized accelerated stochastic gradient descent type of method for decentralized stochastic optimization, considering that communication and synchronization are the major bottlenecks. We establish…
In this work, we study decentralized convex constrained optimization problems in networks. We focus on the dual averaging-based algorithmic framework that is well-documented to be superior in handling constraints and complex communication…
In modern decentralized applications, ensuring communication efficiency and user privacy is a key challenge. In order to train machine-learning models, the algorithm has to communicate with the data center and sample data for its…
In distributed machine learning, efficient training across multiple agents with different data distributions poses significant challenges. Even with a centralized coordinator, current algorithms that achieve optimal communication complexity…
In this paper, we develop a distributed algorithm for solving a class of distributed convex optimization problems where the local objective functions can be a general nonsmooth function, and all equalities and inequalities are network-wide…
In this work, we introduce ADAPD, A DecentrAlized Primal-Dual algorithmic framework for solving non-convex and smooth consensus optimization problems over a network of distributed…
This paper develops a continuous-time primal-dual accelerated method with an increasing damping coefficient for a class of convex optimization problems with affine equality constraints. This paper analyzes critical values for parameters in…
Scalable machine learning over big data is an important problem that has received a lot of attention in recent years. On popular distributed environments such as Hadoop running on a cluster of commodity machines, communication costs are…