Distributed Proximal Gradient Algorithm for Partially Asynchronous Computer Clusters

Yi Zhou; Yaoliang Yu; Wei Dai; Yingbin Liang; Eric P. Xing

Optimization and Control 2017-04-13 v1

Abstract

With ever-growing data volume and model size, an error-tolerant, communication-efficient, yet versatile distributed algorithm has become vital for the success of many large-scale machine learning applications. In this work we propose m-PAPG, an implementation of the flexible proximal gradient algorithm in model-parallel systems equipped with the partially asynchronous communication protocol. The worker machines communicate asynchronously with a controlled staleness bound $s$ and operate at different frequencies. We characterize various convergence properties of m-PAPG: 1) Under a general non-smooth and non-convex setting, we prove that every limit point of the sequence generated by m-PAPG is a critical point of the objective function; 2) Under an error bound condition, we prove that the function value decays linearly for every $s$ steps; 3) Under the Kurdyka-Łojasiewicz inequality, we prove that the sequences generated by m-PAPG converge to the same critical point, provided that a proximal Lipschitz condition is satisfied.
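
To make the setting concrete, the following is a minimal single-process sketch of a proximal gradient iteration with bounded staleness. It is not the authors' implementation of m-PAPG: the lasso objective $\frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$, the one-coordinate-per-worker partition, the random delay model, and all function names (`soft_threshold`, `objective`, `stale_prox_gradient`) are assumptions chosen purely for illustration. Each simulated worker sees its own coordinate fresh and the other coordinates delayed by at most $s$ iterations, then applies a gradient step followed by the proximal map.

```python
import random


def soft_threshold(v, tau):
    """Proximal map of tau * |.| (soft-thresholding)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def objective(A, b, x, lam):
    """Lasso objective 0.5 * ||Ax - b||^2 + lam * ||x||_1 (illustrative choice)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xj) for xj in x)


def stale_prox_gradient(A, b, lam, eta, s, iters, seed=0):
    """Simulate model-parallel proximal gradient with staleness bound s.

    Hypothetical setup: worker i owns coordinate i. At iteration t it reads
    its own coordinate without delay and every other coordinate with a
    random delay in [0, min(s, t)].
    """
    rng = random.Random(seed)
    n = len(A[0])
    history = [[0.0] * n]  # history[t] is the global model at iteration t
    for t in range(iters):
        current = history[t]
        new = list(current)
        for i in range(n):
            # Build worker i's (possibly stale) view of the global model.
            view = []
            for j in range(n):
                delay = 0 if j == i else rng.randint(0, min(s, t))
                view.append(history[t - delay][j])
            # Partial gradient of 0.5 * ||Ax - b||^2 w.r.t. coordinate i,
            # evaluated at the stale view.
            residual = [sum(a * v for a, v in zip(row, view)) - bi
                        for row, bi in zip(A, b)]
            grad_i = sum(row[i] * r for row, r in zip(A, residual))
            # Gradient step on the smooth part, then the proximal map.
            new[i] = soft_threshold(current[i] - eta * grad_i, eta * lam)
        history.append(new)
    return history[-1]
```

For example, `stale_prox_gradient([[1.0, 0.1], [0.1, 1.0]], [1.0, -1.0], lam=0.1, eta=0.2, s=3, iters=400)` runs the simulation on a small, weakly coupled problem. The step size `eta` here is picked by hand for this toy instance; the abstract does not state how the step size should depend on $s$.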

Keywords

stochastic gradient descent, stochastic optimization, network theory

Cite

@article{arxiv.1704.03540,
  title  = {Distributed Proximal Gradient Algorithm for Partially Asynchronous Computer Clusters},
  author = {Yi Zhou and Yaoliang Yu and Wei Dai and Yingbin Liang and Eric P. Xing},
  journal= {arXiv preprint arXiv:1704.03540},
  year   = {2017}
}
