Related papers: Randomized Block-Diagonal Preconditioning for Para…
In recent years, there has been a growing need to train machine learning models on a huge volume of data. Designing efficient distributed optimization algorithms for empirical risk minimization (ERM) has therefore become an active and challenging…
Based on a preconditioned version of the randomized block-coordinate forward-backward algorithm recently proposed in [Combettes, Pesquet, 2014], several variants of block-coordinate primal-dual algorithms are designed in order to solve a wide…
In this paper, we further investigate and refine the subspace-constrained preconditioning technique to enhance the theoretical and numerical convergence properties of randomized iterative methods for solving linear systems. In particular,…
Block-coordinate algorithms are recognized to furnish efficient iterative schemes for addressing large-scale problems, especially when the computation of full derivatives entails substantial memory requirements and computational efforts. In…
Preconditioning has long been a staple technique in optimization, often applied to reduce the condition number of a matrix and speed up the convergence of algorithms. Although there are many popular preconditioning techniques in practice,…
Modern adaptive optimization methods, such as Adam and its variants, have emerged as the most widely used tools in deep learning over recent years. These algorithms offer automatic mechanisms for dynamically adjusting the update step based…
The efficient solution of moderately large-scale linear systems arising from the KKT conditions in optimal control problems (OCPs) is a critical challenge in robotics. With the stagnation of Moore's law, there is growing interest in…
We consider the problem of minimizing block-separable convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both…
We present a methodology for parallel acceleration of learning in the presence of matrix orthogonality and unitarity constraints of interest in several branches of machine learning. We show how an apparently sequential elementary rotation…
We study the problem of minimizing the sum of potentially non-differentiable convex cost functions with partially overlapping dependences in an asynchronous manner, where communication in the network is not coordinated. We study the…
We consider stochastic convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory. We propose a robust training method for the constrained setting and derive non-asymptotic convergence…
We present and analyze a parallel implementation of a parallel-in-time collocation method based on $\alpha$-circulant preconditioned Richardson iterations. While many papers explore this family of single-level, time-parallel "all-at-once"…
We present a parallelized primal-dual algorithm for solving constrained convex optimization problems. The algorithm is "block-based," in that vectors of primal and dual variables are partitioned into blocks, each of which is updated only by…
In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex function and a simple separable convex…
In this paper we propose a randomized primal-dual proximal block coordinate updating framework for a general multi-block convex optimization model with coupled objective function and linear constraints. Assuming mere convexity, we establish…
We study the block-coordinate forward-backward algorithm in which the blocks are updated in a random and possibly parallel manner, according to arbitrary probabilities. The algorithm allows different stepsizes along the block-coordinates to…
There has been a growing interest in parallel strategies for solving trajectory optimization problems. One key step in many algorithmic approaches to trajectory optimization is the solution of moderately large and sparse linear systems.…
We propose new sequential sorting operations by adapting techniques and methods used for designing parallel sorting algorithms. Although the norm is to parallelize a sequential algorithm to improve performance, we adapt a contrarian…
Adaptive gradient approaches that automatically adjust the learning rate on a per-feature basis have been very popular for training deep networks. This rich class of algorithms includes Adagrad, RMSprop, Adam, and recent extensions. All…