Related papers: Multi-kernel Passive Stochastic Gradient Algorithm…

Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning

Stochastic gradient descent algorithms for training linear and kernel predictors are gaining more and more importance, thanks to their scalability. While various methods have been proposed to speed up their convergence, the model selection…

Machine Learning · Computer Science 2014-06-17 Francesco Orabona

A Robust Adaptive Stochastic Gradient Method for Deep Learning

Stochastic gradient algorithms are the main focus of large-scale optimization problems and led to important successes in the recent advancement of the deep learning algorithms. The convergence of SGD depends on the careful choice of…

Machine Learning · Computer Science 2017-03-03 Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the…

Machine Learning · Computer Science 2015-11-03 Caglar Gulcehre, Marcin Moczulski, Yoshua Bengio

Stochastic Gradient Descent Meets Distribution Regression

Stochastic gradient descent (SGD) provides a simple and efficient way to solve a broad range of machine learning problems. Here, we focus on distribution regression (DR), involving two stages of sampling: Firstly, we regress from…

Machine Learning · Statistics 2021-03-08 Nicole Mücke

Enhanced $q$-Least Mean Square

In this work, a new class of stochastic gradient algorithm is developed based on $q$-calculus. Unlike the existing $q$-LMS algorithm, the proposed approach fully utilizes the concept of $q$-calculus by incorporating time-varying $q$…

Optimization and Control · Mathematics 2018-01-03 Shujaat Khan, Alishba Sadiq, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes

We consider stochastic gradient descent (SGD) for least-squares regression with potentially several passes over the data. While several passes have been widely reported to perform practically better in terms of predictive performance on…

Machine Learning · Computer Science 2018-11-26 Loucas Pillaud-Vivien, Alessandro Rudi, Francis Bach

Stochastic Variational Deep Kernel Learning

Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which…

Machine Learning · Statistics 2016-11-03 Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing

Tracking the Median of Gradients with a Stochastic Proximal Point Method

There are several applications of stochastic optimization where one can benefit from a robust estimate of the gradient. For example, domains such as distributed learning with corrupted nodes, the presence of large outliers in the training…

Machine Learning · Statistics 2025-10-30 Fabian Schaipp, Guillaume Garrigos, Umut Simsekli, Robert Gower

Scaling transition from momentum stochastic gradient descent to plain stochastic gradient descent

The plain stochastic gradient descent and momentum stochastic gradient descent have extremely wide applications in deep learning due to their simple settings and low computational complexity. The momentum stochastic gradient descent uses…

Machine Learning · Computer Science 2021-06-15 Kun Zeng, Jinlan Liu, Zhixia Jiang, Dongpo Xu

On Distributed Non-convex Optimization: Projected Subgradient Method For Weakly Convex Problems in Networks

The stochastic subgradient method is a widely-used algorithm for solving large-scale optimization problems arising in machine learning. Often these problems are neither smooth nor convex. Recently, Davis et al. [1-2] characterized the…

Optimization and Control · Mathematics 2021-02-25 Shixiang Chen, Alfredo Garcia, Shahin Shahrampour

Quantum Kernel Alignment with Stochastic Gradient Descent

Quantum support vector machines have the potential to achieve a quantum speedup for solving certain machine learning problems. The key challenge for doing so is finding good quantum kernels for a given data set -- a task called kernel…

Quantum Physics · Physics 2023-12-08 Gian Gentinetta, David Sutter, Christa Zoufal, Bryce Fuller, Stefan Woerner

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions. In these contexts, it is beneficial to provide these…

Computer Vision and Pattern Recognition · Computer Science 2016-10-06 Stefan Lee, Senthil Purushwalkam, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra

Adaptive Consensus Gradients Aggregation for Scaled Distributed Training

Distributed machine learning has recently become a critical paradigm for training large models on vast datasets. We examine the stochastic optimization problem for deep learning within synchronous parallel computing environments under…

Machine Learning · Computer Science 2024-11-07 Yoni Choukroun, Shlomi Azoulay, Pavel Kisilev

Sparse Multiple Kernel Learning with Geometric Convergence Rate

In this paper, we study the problem of sparse multiple kernel learning (MKL), where the goal is to efficiently learn a combination of a fixed small number of kernels from a large pool that could lead to a kernel classifier with a small…

Machine Learning · Computer Science 2013-02-05 Rong Jin, Tianbao Yang, Mehrdad Mahdavi

Low-depth gradient measurements can improve convergence in variational hybrid quantum-classical algorithms

A broad class of hybrid quantum-classical algorithms known as "variational algorithms" have been proposed in the context of quantum simulation, machine learning, and combinatorial optimization as a means of potentially achieving a quantum…

Quantum Physics · Physics 2021-04-09 Aram Harrow, John Napp

Linearly Convergent Algorithm with Variance Reduction for Distributed Stochastic Optimization

This paper considers a distributed stochastic strongly convex optimization, where agents connected over a network aim to cooperatively minimize the average of all agents' local cost functions. Due to the stochasticity of gradient estimation…

Optimization and Control · Mathematics 2020-02-17 Jinlong Lei, Peng Yi, Jie Chen, Yiguang Hong

Posterior Approximation using Stochastic Gradient Ascent with Adaptive Stepsize

Scalable algorithms of posterior approximation allow Bayesian nonparametrics such as Dirichlet process mixture to scale up to larger dataset at fractional cost. Recent algorithms, notably the stochastic variational inference performs local…

Machine Learning · Computer Science 2025-02-25 Kart-Leong Lim, Xudong Jiang

Stochastic Gradients under Nuisances

Stochastic gradient optimization is the dominant learning paradigm for a variety of scenarios, from classical supervised learning to modern self-supervised learning. We consider stochastic gradient algorithms for learning problems whose…

Machine Learning · Statistics 2025-08-29 Facheng Yu, Ronak Mehta, Alex Luedtke, Zaid Harchaoui

A New Stochastic Approximation Method for Gradient-based Simulated Parameter Estimation

This paper tackles the challenge of parameter calibration in stochastic models, particularly in scenarios where the likelihood function is unavailable in an analytical form. We introduce a gradient-based simulated parameter estimation…

Machine Learning · Statistics 2025-03-25 Zehao Li, Yijie Peng

Approximate Stochastic Subgradient Estimation Training for Support Vector Machines

Subgradient algorithms for training support vector machines have been quite successful for solving large-scale and online learning problems. However, they have been restricted to linear kernels and strongly convex formulations. This paper…

Machine Learning · Computer Science 2011-11-04 Sangkyun Lee, Stephen J. Wright