Related papers: Distributed Mini-Batch SDCA

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent

Stochastic dual coordinate ascent (SDCA) is an effective technique for solving regularized loss minimization problems in machine learning. This paper considers an extension of SDCA under the mini-batch setting that is often used in…

Machine Learning · Statistics 2013-05-14 Shai Shalev-Shwartz, Tong Zhang

Mini-Batch Primal and Dual Methods for SVMs

We address the issue of using mini-batches in stochastic optimization of SVMs. We show that the same quantity, the spectral norm of the data, controls the parallelization speedup obtained for both primal stochastic subgradient descent (SGD)…

Machine Learning · Computer Science 2013-03-12 Martin Takáč, Avleen Bijral, Peter Richtárik, Nathan Srebro

Dual Free Adaptive Minibatch SDCA for Empirical Risk Minimization

In this paper we develop an adaptive dual free Stochastic Dual Coordinate Ascent (adfSDCA) algorithm for regularized empirical risk minimization problems. This is motivated by the recent work on dual free SDCA of Shalev-Shwartz (2016). The…

Optimization and Control · Mathematics 2018-01-26 Xi He, Rachael Tappenden, Martin Takac

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

Stochastic Gradient Descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to their strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA)…

Machine Learning · Statistics 2015-03-20 Shai Shalev-Shwartz, Tong Zhang

Dual Free Adaptive Mini-batch SDCA for Empirical Risk Minimization

In this paper we develop dual free mini-batch SDCA with adaptive probabilities for regularized empirical risk minimization. This work is motivated by recent work of Shai Shalev-Shwartz on dual free SDCA method, however, we allow a…

Optimization and Control · Mathematics 2018-05-25 Xi He, Martin Takáč

SDCA without Duality

Stochastic Dual Coordinate Ascent is a popular method for solving regularized loss minimization for the case of convex losses. In this paper we show how a variant of SDCA can be applied for non-convex losses. We prove linear convergence…

Machine Learning · Computer Science 2015-02-24 Shai Shalev-Shwartz

Data Dependent Convergence for Distributed Stochastic Optimization

In this dissertation we propose alternative analysis of distributed stochastic gradient descent (SGD) algorithms that rely on spectral properties of the data covariance. As a consequence we can relate questions pertaining to speedups and…

Optimization and Control · Mathematics 2016-09-03 Avleen S. Bijral

Analysis of Distributed Stochastic Dual Coordinate Ascent

In Yang (2013), the author presented distributed stochastic dual coordinate ascent (DisDCA) algorithms for solving large-scale regularized loss minimization. Extraordinary performances have been observed and reported for the…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-03-25 Tianbao Yang, Shenghuo Zhu, Rong Jin, Yuanqing Lin

Stochastic Dual Coordinate Ascent with Adaptive Probabilities

This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving the regularized empirical risk minimization problems. Our modification consists in allowing the method to adaptively change the…

Optimization and Control · Mathematics 2015-03-02 Dominik Csiba, Zheng Qu, Peter Richtárik

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a golden standard in the…

Optimization and Control · Mathematics 2017-09-20 Tomoya Murata, Taiji Suzuki

SDCA without Duality, Regularization, and Individual Convexity

Stochastic Dual Coordinate Ascent is a popular method for solving regularized loss minimization for the case of convex losses. We describe variants of SDCA that do not require explicit regularization and do not rely on duality. We prove…

Machine Learning · Computer Science 2016-05-24 Shai Shalev-Shwartz

On Data Dependence in Distributed Stochastic Optimization

We study a distributed consensus-based stochastic gradient descent (SGD) algorithm and show that the rate of convergence involves the spectral properties of two matrices: the standard spectral gap of a weight matrix from the network…

Optimization and Control · Mathematics 2016-09-02 Avleen S. Bijral, Anand D. Sarwate, Nathan Srebro

Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses

In this work we develop a new algorithm for regularized empirical risk minimization. Our method extends recent techniques of Shalev-Shwartz [02/2015], which enable a dual-free analysis of SDCA, to arbitrary mini-batching schemes. Moreover,…

Optimization and Control · Mathematics 2015-06-09 Dominik Csiba, Peter Richtárik

Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling

Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is…

Machine Learning · Statistics 2014-05-14 Peilin Zhao, Tong Zhang

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an…

Machine Learning · Statistics 2020-02-04 Kenji Kawaguchi, Haihao Lu

Randomized Dual Coordinate Ascent with Arbitrary Sampling

We study the problem of minimizing the average of a large number of smooth convex functions penalized with a strongly convex regularizer. We propose and analyze a novel primal-dual method (Quartz) which at every iteration samples and…

Optimization and Control · Mathematics 2014-11-24 Zheng Qu, Peter Richtárik, Tong Zhang

Batched Stochastic Gradient Descent with Weighted Sampling

We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the…

Numerical Analysis · Mathematics 2017-03-02 Deanna Needell, Rachel Ward

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve…

Machine Learning · Statistics 2013-10-09 Shai Shalev-Shwartz, Tong Zhang

Determinantal Point Processes for Mini-Batch Diversification

We study a mini-batch diversification scheme for stochastic gradient descent (SGD). While classical SGD relies on uniformly sampling data points to form a mini-batch, we propose a non-uniform sampling scheme based on the Determinantal Point…

Machine Learning · Computer Science 2017-09-12 Cheng Zhang, Hedvig Kjellstrom, Stephan Mandt

Mini-batch stochastic gradient descent with dynamic sample sizes

We focus on solving constrained convex optimization problems using mini-batch stochastic gradient descent. Dynamic sample size rules are presented which ensure a descent direction with high probability. Empirical results from two…

Optimization and Control · Mathematics 2017-08-03 Michael R. Metel