Related papers: PASSCoDe: Parallel ASynchronous Stochastic dual Co…

200 papers

The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in…

Distributed, Parallel, and Cluster Computing · Computer Science · 2015-10-06 · Janis Keuper, Franz-Josef Pfreundt

In this paper we propose a novel parallel stochastic coordinate descent (SCD) algorithm with convergence guarantees that exhibits strong scalability. We start by studying a state-of-the-art parallel implementation of SCD and identify…

Machine Learning · Computer Science · 2019-11-19 · Nikolas Ioannou, Celestine Mendler-Dünner, Thomas Parnell

Randomized coordinate descent (RCD) methods are state-of-the-art algorithms for training linear predictors via minimizing regularized empirical risk. When the number of examples ($n$) is much larger than the number of features ($d$), a…

Optimization and Control · Mathematics · 2016-05-31 · Dominik Csiba, Peter Richtárik

In prior works, stochastic dual coordinate ascent (SDCA) has been parallelized in a multi-core environment where the cores communicate through shared memory, or in a multi-processor distributed memory environment where the processors…

Distributed, Parallel, and Cluster Computing · Computer Science · 2016-11-03 · Soumitra Pal, Tingyang Xu, Tianbao Yang, Sanguthevar Rajasekaran, Jinbo Bi

We consider a generic convex optimization problem associated with regularized empirical risk minimization of linear predictors. The problem structure allows us to reformulate it as a convex-concave saddle point problem. We propose a…

Optimization and Control · Mathematics · 2015-09-10 · Yuchen Zhang, Lin Xiao

We seek tight bounds on the viable parallelism in asynchronous implementations of coordinate descent that achieves linear speedup. We focus on asynchronous coordinate descent (ACD) algorithms on convex functions which consist of the sum of…

Optimization and Control · Mathematics · 2020-08-04 · Yun Kuen Cheung, Richard Cole, Yixin Tao

We describe an asynchronous parallel stochastic proximal coordinate descent algorithm for minimizing a composite objective function, which consists of a smooth convex function plus a separable convex function. In contrast to previous…

Optimization and Control · Mathematics · 2015-12-14 · Ji Liu, Stephen J. Wright

We provide improved parallel approximation algorithms for the important class of packing and covering linear programs. In particular, we present new parallel $\epsilon$-approximate packing and covering solvers which run in…

Data Structures and Algorithms · Computer Science · 2015-11-23 · Di Wang, Michael Mahoney, Nishanth Mohan, Satish Rao

We describe an asynchronous parallel stochastic coordinate descent algorithm for minimizing smooth unconstrained or separably constrained functions. The method achieves a linear convergence rate on functions that satisfy an essential strong…

Optimization and Control · Mathematics · 2014-11-12 · Ji Liu, Stephen J. Wright, Christopher Ré, Victor Bittorf, Srikrishna Sridhar

We consider convex-concave saddle point problems with a separable structure and non-strongly convex functions. We propose an efficient stochastic block coordinate descent method using adaptive primal-dual updates, which enables flexible…

Machine Learning · Statistics · 2015-11-24 · Zhanxing Zhu, Amos J. Storkey

Decentralized optimization has become vital for leveraging distributed data without central control, enhancing scalability and privacy. However, practical deployments face fundamental challenges due to heterogeneous computation speeds and…

Machine Learning · Computer Science · 2025-05-16 · Yijie Zhou, Shi Pu

Asynchronous stochastic gradient descent (ASGD) is a popular parallel optimization algorithm in machine learning. Most theoretical analyses of ASGD take a discrete view and prove upper bounds for their convergence rates. However, the…

Machine Learning · Statistics · 2018-05-09 · Li He, Qi Meng, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

We consider a generic convex-concave saddle point problem with separable structure, a form that covers a wide range of machine learning applications. Under this problem structure, we follow the framework of primal-dual updates for saddle…

Machine Learning · Statistics · 2015-06-15 · Zhanxing Zhu, Amos J. Storkey

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm: at each step, the processing of the current example depends on the parameters learned from…

Machine Learning · Computer Science · 2017-05-24 · Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz

In this paper we develop an adaptive dual free Stochastic Dual Coordinate Ascent (adfSDCA) algorithm for regularized empirical risk minimization problems. This is motivated by the recent work on dual free SDCA of Shalev-Shwartz (2016). The…

Optimization and Control · Mathematics · 2018-01-26 · Xi He, Rachael Tappenden, Martin Takac

Machine learning with big data often involves large optimization models. For distributed optimization over a cluster of machines, frequent communication and synchronization of all model parameters (optimization variables) can be very…

Optimization and Control · Mathematics · 2017-10-17 · Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen

We consider the distributed learning problem with data dispersed across multiple workers under the orchestration of a central server. Asynchronous Stochastic Gradient Descent (SGD) has been widely explored in such a setting to reduce the…

Machine Learning · Computer Science · 2024-05-28 · Xiaolu Wang, Yuchang Sun, Hoi-To Wai, Jun Zhang

The primal-dual distributed optimization methods have broad large-scale machine learning applications. Previous primal-dual distributed methods are not applicable when the dual formulation is not available, e.g. the sum-of-non-convex…

Machine Learning · Computer Science · 2017-10-30 · Zhouyuan Huo, Heng Huang

Block coordinate descent (BCD) methods and their variants have been widely used for large-scale unconstrained optimization problems in many fields, such as image processing, machine learning, and compressed sensing. For…

Optimization and Control · Mathematics · 2018-04-04 · Daoli Zhu, Lei Zhao

In the realm of big data and machine learning, data-parallel, distributed stochastic algorithms have drawn significant attention in recent years. While the synchronous versions of these algorithms are well understood in terms of their…

Optimization and Control · Mathematics · 2020-04-07 · Atal Narayan Sahu, Aritra Dutta, Aashutosh Tiwari, Peter Richtárik