Randomized Block-Diagonal Preconditioning for Parallel Learning

Celestine Mendler-Dünner; Aurelien Lucchi

Machine Learning | 2020-12-08 | v2 | Distributed, Parallel, and Cluster Computing; Machine Learning

Authors: Celestine Mendler-Dünner, Aurelien Lucchi

Abstract

We study preconditioned gradient-based optimization methods where the preconditioning matrix has block-diagonal form. Such a structural constraint comes with the advantage that the update computation is block-separable and can be parallelized across multiple independent tasks. Our main contribution is to demonstrate that the convergence of these methods can be significantly improved by a randomization technique that corresponds to repartitioning coordinates across tasks during the optimization procedure. We provide a theoretical analysis that accurately characterizes the expected convergence gains of repartitioning and validate our findings empirically on various traditional machine learning tasks. From an implementation perspective, block-separable models are well suited for parallelization and, when shared memory is available, randomization can be implemented on top of existing methods very efficiently to improve convergence.
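
To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a strongly convex quadratic objective f(x) = 0.5 xᵀAx − bᵀx, exact solves on each diagonal block of A, and a conservative step size of 1/K for K blocks. The function name `block_preconditioned_descent` and all of its parameters are hypothetical. The inner loop over blocks is written sequentially here; each block solve touches only its own coordinates, so the blocks could be handled by independent tasks.

```python
import numpy as np


def block_preconditioned_descent(A, b, num_blocks, num_iters, repartition, seed=0):
    """Gradient descent on 0.5*x'Ax - b'x with a block-diagonal preconditioner.

    Assumption: A is symmetric positive definite. If `repartition` is True,
    coordinates are randomly reassigned to blocks at every iteration;
    otherwise one random partition is drawn once and kept fixed.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)
    # Conservative step: for SPD A and K blocks, A <= K * blockdiag(A),
    # so a step of 1/K cannot increase the objective.
    step = 1.0 / num_blocks
    blocks = np.array_split(rng.permutation(n), num_blocks)
    losses = []
    for _ in range(num_iters):
        if repartition:
            # Randomization: redraw the coordinate-to-block assignment.
            blocks = np.array_split(rng.permutation(n), num_blocks)
        grad = A @ x - b
        update = np.zeros(n)
        for idx in blocks:
            # Each block only needs its own diagonal sub-matrix of A,
            # which is what makes the update block-separable.
            update[idx] = np.linalg.solve(A[np.ix_(idx, idx)], grad[idx])
        x = x - step * update
        losses.append(0.5 * x @ A @ x - b @ x)
    return x, losses
```

Calling the function once with `repartition=False` and once with `repartition=True` on the same `A` and `b` gives two loss curves that can be compared directly. Any gap between them depends on the problem instance and is not something this sketch guarantees.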

Keywords

parallel algorithm; optimization; convex optimization

Cite

@article{arxiv.2006.13591,
  title  = {Randomized Block-Diagonal Preconditioning for Parallel Learning},
  author = {Celestine Mendler-Dünner and Aurelien Lucchi},
  journal= {arXiv preprint arXiv:2006.13591},
  year   = {2020}
}

Comments

Improvement in Theorem 3 compared to the ICML 2020 version.
