Related papers: Adaptive Sequential Optimization with Applications…
A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The…
A framework is introduced for sequentially solving convex stochastic minimization problems, where the objective functions change slowly, in the sense that the distance between successive minimizers is bounded. The minimization problems are…
We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an…
Many relevant problems in the area of systems and control, such as controller synthesis, observer design and model reduction, can be viewed as optimization problems involving dynamical systems: for instance, maximizing performance in the…
A framework is introduced for actively and adaptively solving a sequence of machine learning problems, which are changing in bounded manner from one time step to the next. An algorithm is developed that actively queries the labels of the…
Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial, but the most popular algorithm, Stochastic Gradient Descent…
Stochastic gradient descent (SGD) provides a simple and efficient way to solve a broad range of machine learning problems. Here, we focus on distribution regression (DR), involving two stages of sampling: Firstly, we regress from…
Stochastic gradient algorithms are the main focus of large-scale optimization problems and led to important successes in the recent advancement of the deep learning algorithms. The convergence of SGD depends on the careful choice of…
In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a…
Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the…
Stochastic Gradient Descent (SGD) is one of the most widely used techniques for online optimization in machine learning. In this work, we accelerate SGD by adaptively learning how to sample the most useful training examples at each time…
Stochastic gradient descent (SGD) method is popular for solving non-convex optimization problems in machine learning. This work investigates SGD from a viewpoint of graduated optimization, which is a widely applied approach for non-convex…
The representation of functions by artificial neural networks depends on a large number of parameters in a non-linear fashion. Suitable parameters of these are found by minimizing a 'loss functional', typically by stochastic gradient…
We propose a stochastic modified equations (SME) for modeling the asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the…
We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such…
The stochastic gradient descent (SGD) method is a widely used approach for solving stochastic optimization problems, but its convergence is typically slow. Existing variance reduction techniques, such as SAGA, improve convergence by…
Stochastic gradient descent (SGD) has been a go-to algorithm for nonconvex stochastic optimization problems arising in machine learning. Its theory however often requires a strong framework to guarantee convergence properties. We hereby…
Stochastic gradient descent (SGD) is a widely used algorithm in machine learning, particularly for neural network training. Recent studies on SGD for canonical quadratic optimization or linear regression show it attains well generalization…
Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. Unfortunately, these techniques are unable to deal with stochastic perturbations of input data, induced for…
Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD). This effectively removes all needs for tuning, while automatically reducing learning rates over time on…