Related papers: OFFO minimization algorithms for second-order opti…
A parametric class of trust-region algorithms for unconstrained nonconvex optimization is considered where the value of the objective function is never computed. The class contains a deterministic version of the first-order Adagrad method…
A new algorithm for smooth constrained optimization is proposed that never computes the value of the problem's objective function and that handles both equality and inequality constraints. The algorithm uses an adaptive switching strategy…
We propose an Adagrad-like algorithm for multi-objective unconstrained optimization that relies on the computation of a common descent direction only. Unlike classical local algorithms for multi-objective optimization, our approach does not…
An algorithm is proposed for solving optimization problems with stochastic objective and deterministic equality and inequality constraints. This algorithm is objective-function-free in the sense that it only uses the objective's gradient…
Two OFFO (Objective-Function Free Optimization) noise tolerant algorithms are presented that handle bound constraints, inexact gradients and use second-order information when available. The first is a multi-level method exploiting a…
A parametric class of trust-region algorithms for constrained nonconvex optimization is analyzed, where the objective function is never computed. By defining appropriate first-order stationarity criteria, we are able to extend the Adagrad…
We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or…
A very simple first-order algorithm is proposed for solving nonlinear optimization problems with deterministic nonlinear equality constraints. This algorithm adaptively selects steps in the plane tangent to the constraints or steps that…
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves $O(\sigma / \sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$,…
A fully stochastic second-order adaptive-regularization method for unconstrained nonconvex optimization is presented which never computes the objective-function value, but yet achieves the optimal $\mathcal{O}(\epsilon^{-3/2})$ complexity…
Zeroth-order optimization is the process of minimizing an objective $f(x)$, given oracle access to evaluations at adaptively chosen inputs $x$. In this paper, we present two simple yet powerful GradientLess Descent (GLD) algorithms that do…
An algorithm for unconstrained non-convex optimization is described, which does not evaluate the objective function and in which minimization is carried out, at each iteration, within a randomly selected subspace. It is shown that this…
We consider the case of derivative-free algorithms for non-convex optimization, also known as zero order algorithms, that use only function evaluations rather than gradients. For a wide variety of gradient approximators based on finite…
The graduated optimization approach, also known as the continuation method, is a popular heuristic for solving non-convex problems that has received renewed interest over the last decade. Despite its popularity, very little is known in terms…
Two classes of algorithms for optimization in the presence of noise are presented, that do not require the evaluation of the objective function. The first generalizes the well-known Adagrad method. Its complexity is then analyzed as a…
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, that involve second derivatives and/or second…
An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization…
The analysis of gradient descent-type methods typically relies on the Lipschitz continuity of the objective gradient. This generally requires an expensive hyperparameter tuning process to appropriately calibrate a stepsize for a given…
ADAGB2, a generalization of the Adagrad algorithm for stochastic optimization, is introduced, which is also applicable to bound-constrained problems and capable of using second-order information when available. It is shown that, given…
Zeroth-order (derivative-free) optimization attracts a lot of attention in machine learning, because explicit gradient calculations may be computationally expensive or infeasible. To handle large scale problems both in volume and dimension,…