Asynchronous Distributed Optimization with Stochastic Delays
Abstract
We study asynchronous finite-sum minimization in a distributed-data setting with a central parameter server. While asynchrony is well understood in parallel settings where the data is accessible by all machines -- e.g., modifications of variance-reduced gradient algorithms like SAGA work well -- little is known for the distributed-data setting. We develop an algorithm, ADSAGA, based on SAGA for the distributed-data setting, in which the data is partitioned between many machines. We show that with machines, under a natural stochastic delay model with a mean delay of , ADSAGA converges in iterations, where is the number of component functions, and is a condition number. This complexity sits squarely between the complexity of SAGA \textit{without delays} and the complexity of parallel asynchronous algorithms where the delays are \textit{arbitrary} (but bounded by ) and the data is accessible by all. Existing asynchronous algorithms for the distributed-data setting with arbitrary delays have only been shown to converge in iterations. On least-squares problems, we empirically compare the iteration complexity and wallclock performance of ADSAGA to those of existing parallel and distributed algorithms, including synchronous minibatch algorithms. Our results demonstrate the wallclock advantage of variance-reduced asynchronous approaches over SGD or synchronous approaches.
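For orientation, the following is a minimal sketch of the standard serial SAGA update on a least-squares objective, the kind of problem used in the experiments. It is not ADSAGA: there is no parameter server, no data partitioning, and no delay model. The function name `saga_least_squares`, its parameters, and the zero initialization are assumptions made for illustration only.

```python
import random


def saga_least_squares(A, b, step, iters, seed=0):
    """Serial SAGA sketch for f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.

    Hypothetical helper for illustration; not the paper's ADSAGA.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d

    # The gradient of f_i is r_i * a_i with r_i = a_i . x - b_i, so the
    # gradient table only needs one scalar per component function.
    resid = [-b[i] for i in range(n)]  # residuals at the starting point x = 0

    # Running average of the stored component gradients.
    avg = [sum(resid[i] * A[i][k] for i in range(n)) / n for k in range(d)]

    for _ in range(iters):
        j = rng.randrange(n)
        r_new = sum(A[j][k] * x[k] for k in range(d)) - b[j]
        diff = r_new - resid[j]

        # Variance-reduced step: fresh gradient - stored gradient + average.
        for k in range(d):
            x[k] -= step * (diff * A[j][k] + avg[k])

        # Refresh the running average and the stored residual for index j.
        for k in range(d):
            avg[k] += diff * A[j][k] / n
        resid[j] = r_new

    return x
```

A usage sketch: calling `saga_least_squares([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]], [2.0, -1.0, 1.0, 3.0], step=1/6, iters=2000)` targets a consistent system whose solution is `[2, -1]`; the step `1/6` follows the common `1/(3L)` choice with `L` taken as the largest squared row norm.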
Cite
@article{arxiv.2009.10717,
title = {Asynchronous Distributed Optimization with Stochastic Delays},
author = {Margalit Glasgow and Mary Wootters},
journal = {arXiv preprint arXiv:2009.10717},
year = {2021}
}
Comments
arXiv admin note: substantial text overlap with arXiv:2006.09638