English
Related papers

Related papers: Accelerating SGD for Distributed Deep-Learning Usi…

200 papers

In this paper, we propose a distributed second- order method for reinforcement learning. Our approach is the fastest in literature so-far as it outperforms state-of-the-art methods, including ADMM, by significant margins. We achieve this by…

Optimization and Control · Mathematics 2016-08-08 Rasul Tutunov , Haitham Bou-Ammar , Ali Jadbabaie

Deep learning involves a difficult non-convex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but…

We consider distributed optimization problems where networked nodes cooperatively minimize the sum of their locally known convex costs. A popular class of methods to solve these problems are the distributed gradient methods, which are…

Information Theory · Computer Science 2017-02-21 Dragana Bajovic , Dusan Jakovetic , Natasa Krejic , Natasa Krklec Jerinkic

This paper proposes a novel distributed semismooth Newton based augmented Lagrangian method for solving a class of optimization problems over networks, where the global objective is defined as the sum of locally held cost functions, and…

Optimization and Control · Mathematics 2026-03-02 Qihao Ma , Chengjing Wang , Peipei Tang , Dunbiao Niu , Aimin Xu

First-order methods like stochastic gradient descent(SGD) are recently the popular optimization method to train deep neural networks (DNNs), but second-order methods are scarcely used because of the overpriced computing cost in getting the…

Machine Learning · Computer Science 2021-04-01 Jingcheng Zhou , Wei Wei , Zhiming Zheng

We study distributed algorithms for expected loss minimization where the datasets are large and have to be stored on different machines. Often we deal with minimizing the average of a set of convex functions where each function is the…

Machine Learning · Computer Science 2019-07-24 Samira Sheikhi

In a multi-agent network, we consider the problem of minimizing an objective function that is expressed as the sum of private convex and smooth functions, and a (possibly) non-differentiable convex regularizer. We propose a novel…

Optimization and Control · Mathematics 2021-09-30 Yichuan Li , Nikolaos M. Freris , Petros Voulgaris , Dusan Stipanovic

Training deep neural network is a high dimensional and a highly non-convex optimization problem. Stochastic gradient descent (SGD) algorithm and it's variations are the current state-of-the-art solvers for this task. However, due to…

Machine Learning · Computer Science 2017-01-17 Xi He , Dheevatsa Mudigere , Mikhail Smelyanskiy , Martin Takáč

We propose a continuous-time second-order optimization algorithm for solving unconstrained convex optimization problems with bounded Hessian. We show that this alternative algorithm has a comparable convergence rate to that of the…

Optimization and Control · Mathematics 2021-05-21 Hossein Moradian , Solmaz S. Kia

In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation--namely, the need to compute or approximate Hessian…

Machine Learning · Computer Science 2026-04-02 Deyi Kong , Zaiwei Chen , Shuzhong Zhang , Shancong Mou

Large scale optimization problems are ubiquitous in machine learning and data analysis and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, as a way to either speed up the…

Optimization and Control · Mathematics 2016-02-29 Farbod Roosta-Khorasani , Michael W. Mahoney

Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selections so as to increase rewarding experiences in their environments. Deep Reinforcement Learning algorithms require solving a nonconvex and…

Machine Learning · Computer Science 2019-04-18 Jacob Rafati , Roummel F. Marcia

This paper considers distributed optimization problems, where each agent cooperatively minimizes the sum of local objective functions through the communication with its neighbors. The widely adopted distributed gradient method in solving…

Optimization and Control · Mathematics 2025-08-19 Yeming Xu , Ziyuan Guo , Kaihong Lu , Huanshui Zhang

Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive…

Machine Learning · Computer Science 2018-06-21 Celestine Dünner , Aurelien Lucchi , Matilde Gargiani , An Bian , Thomas Hofmann , Martin Jaggi

Due to the rapidly growing scale and heterogeneity of wireless networks, the design of distributed cross-layer optimization algorithms have received significant interest from the networking research community. So far, the standard…

Networking and Internet Architecture · Computer Science 2016-11-18 Jia Liu , Cathy H. Xia , Ness B. Shroff , Hanif D. Sherali

Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence. But they are…

Machine Learning · Computer Science 2017-12-21 Huishuai Zhang , Caiming Xiong , James Bradbury , Richard Socher

The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how…

Optimization and Control · Mathematics 2016-09-28 Raghu Bollapragada , Richard Byrd , Jorge Nocedal

In this paper, we propose a distributed Newton method for consensus optimization. Our approach outperforms state-of-the-art methods, including ADMM. The key idea is to exploit the sparsity of the dual Hessian and recast the computation of…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-22 Rasul Tutunov , Haitham Bou Ammar , Ali Jadbabaie

Dual descent methods are commonly used to solve network optimization problems because their implementation can be distributed through the network. However, their convergence rates are typically very slow. This paper introduces a family of…

Optimization and Control · Mathematics 2011-04-07 M. Zargham , A. Ribeiro , A. Jadbabaie , A. Ozdaglar

Accelerating the convergence of second-order optimization, particularly Newton-type methods, remains a pivotal challenge in algorithmic research. In this paper, we extend previous work on the \textbf{Quadratic Gradient (QG)} and rigorously…

Optimization and Control · Mathematics 2026-04-01 John Chiang
‹ Prev 1 2 3 10 Next ›