Related papers: Random Function Descent
In empirical risk optimization, it has been observed that stochastic gradient implementations that rely on random reshuffling of the data achieve better performance than implementations that rely on sampling the data uniformly. Recent works…
We analyze the convergence rate of the random reshuffling (RR) method, which is a randomized first-order incremental algorithm for minimizing a finite sum of convex component functions. RR proceeds in cycles, picking a uniformly random…
Random Reshuffling (RR) is an algorithm for minimizing finite-sum functions that utilizes iterative gradient descent steps in conjunction with data reshuffling. Often contrasted with its sibling Stochastic Gradient Descent (SGD), RR is…
Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of…
This PhD thesis presents a distributional view of optimization in place of a worst-case perspective. We motivate this view with an investigation of the failure point of classical optimization. Subsequently we consider the optimization of a…
Training neural networks requires optimizing a loss function that may be highly irregular, and in particular neither convex nor smooth. Popular training algorithms are based on stochastic gradient descent with momentum (SGDM), for which…
Online minimization of an unknown convex function over the interval $[0,1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing…
In this paper we propose a variant of the random coordinate descent method for solving linearly constrained convex optimization problems with composite objective functions. If the smooth part of the objective function has Lipschitz…
We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we have only access to realisations of the gradient of the loss. This is a classical task in statistics,…
Stochastic convex optimization is a basic and well studied primitive in machine learning. It is well known that convex and Lipschitz functions can be minimized efficiently using Stochastic Gradient Descent (SGD). The Normalized Gradient…
This paper presents an algorithmic framework for solving unconstrained stochastic optimization problems using only stochastic function evaluations. We employ central finite-difference based gradient estimation methods to approximate the…
We present a family of algorithms, called descent algorithms, for optimizing convex and non-convex functions. We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster…
We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on…
We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks.…
Stochastic (sub)gradient methods require step size schedule tuning to perform well in practice. Classical tuning strategies decay the step size polynomially and lead to optimal sublinear rates on (strongly) convex problems. An alternative…
Gradient descent (GD) and stochastic gradient descent (SGD) are the workhorses of large-scale machine learning. While classical theory focused on analyzing the performance of these methods in convex optimization problems, the most notable…
Finding a local minimum or maximum of a function is often achieved through the gradient-descent optimization method. For a function in dimension d, computing the gradient requires d partial derivatives at each step. This method is for…
We consider unconstrained randomized optimization of convex objective functions. We analyze the Random Pursuit algorithm, which iteratively computes an approximate solution to the optimization problem by repeated optimization over a…
We consider nonconvex-concave minimax optimization problems of the form $\min_{\bf x}\max_{\bf y\in{\mathcal Y}} f({\bf x},{\bf y})$, where $f$ is strongly-concave in $\bf y$ but possibly nonconvex in $\bf x$ and ${\mathcal Y}$ is a convex…
The graduated optimization approach is a method for finding global optimal solutions for nonconvex functions by using a function smoothing operation with stochastic noise. This paper makes three contributions regarding graduated…
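Several of the abstracts above contrast random reshuffling (RR) with stochastic gradient descent that samples data uniformly. The following is a minimal sketch of that difference, not taken from any of the listed papers. It assumes a least-squares finite sum, and the names `epoch_order` and `run_epochs`, the step size, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def epoch_order(rng, n, reshuffle):
    """Return the n component indices visited during one epoch."""
    if reshuffle:
        # Random reshuffling: a fresh permutation, each index exactly once.
        return rng.permutation(n)
    # Uniform sampling with replacement: indices may repeat or be skipped.
    return rng.integers(0, n, size=n)


def run_epochs(A, b, x0, step, epochs, reshuffle, seed=0):
    """Incremental gradient steps on f(x) = (1/2n) * sum_i (a_i . x - b_i)^2."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        for i in epoch_order(rng, n, reshuffle):
            # Gradient of the i-th component 0.5 * (a_i . x - b_i)^2.
            grad_i = (A[i] @ x - b[i]) * A[i]
            x -= step * grad_i
    return x


if __name__ == "__main__":
    # Toy, noise-free problem (an assumption made only for this demo).
    rng = np.random.default_rng(42)
    A = rng.normal(size=(50, 5))
    x_true = rng.normal(size=5)
    b = A @ x_true
    for flag, name in ((True, "random reshuffling"), (False, "uniform sampling")):
        x = run_epochs(A, b, np.zeros(5), step=0.01, epochs=20, reshuffle=flag)
        print(name, "distance to solution:", np.linalg.norm(x - x_true))
```

The only difference between the two variants is the index order within an epoch, which is why both share the same update loop. The sketch makes no claim about which variant converges faster on a given problem.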