Related papers: RES: Regularized Stochastic BFGS Algorithm
Global convergence of an online (stochastic) limited memory version of the Broyden-Fletcher- Goldfarb-Shanno (BFGS) quasi-Newton method for solving optimization problems with stochastic objectives that arise in large scale machine learning…
Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selections so as to increase rewarding experiences in their environments. Deep Reinforcement Learning algorithms require solving a nonconvex and…
Motivated by applications arising from large scale optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving unconstrained convex optimization problems. The convergence analysis of the SQN methods,…
Deep learning algorithms often require solving a highly non-linear and nonconvex unconstrained optimization problem. Methods for solving optimization problems in large-scale machine learning, such as deep learning and deep reinforcement…
In this paper, a modified BFGS algorithm is proposed. The modified BFGS matrix estimates a modified Hessian matrix which is a convex combination of an identity matrix for the steepest descent algorithm and a Hessian matrix for the Newton…
Estimating the Hessian matrix, especially for neural network training, is a challenging problem due to high dimensionality and cost. In this work, we compare the classical Sherman-Morrison update used in the popular BFGS method…
We propose a novel limited-memory stochastic block BFGS update for incorporating enriched curvature information in stochastic approximation methods. In our method, the estimate of the inverse Hessian matrix that is maintained by it, is…
While first-order methods are popular for solving optimization problems that arise in large-scale deep learning problems, they come with some acute deficiencies. To diminish such shortcomings, there has been recent interest in applying…
We present two stochastic descent algorithms that apply to unconstrained optimization and are particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained…
Quasi-Newton methods are ubiquitous in deterministic local search due to their efficiency and low computational cost. This class of methods uses the history of gradient evaluations to approximate second-order derivatives. However, only…
In this paper, we explore the non-asymptotic global convergence rates of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method implemented with exact line search. Notably, due to Dixon's equivalence result, our findings are also applicable to…
Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. The practical implementation of the gradient descent method to bilevel optimization encounters computational…
We study finite-sum nonconvex optimization problems, where the objective function is an average of $n$ nonconvex functions. We propose a new stochastic gradient descent algorithm based on nested variance reduction. Compared with…
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a…
This paper adapts a recently developed regularized stochastic version of the Broyden, Fletcher, Goldfarb, and Shanno (BFGS) quasi-Newton method for the solution of support vector machine classification problems. The proposed method is shown…
L-BFGS is the state-of-the-art optimization method for many large scale inverse problems. It has a small memory footprint and achieves superlinear convergence. The method approximates Hessian based on an initial approximation and an update…
This paper proposes a novel stochastic version of damped and regularized BFGS method for addressing the above problems.
We devise an L-BFGS method for optimization problems in which the objective is the sum of two functions, where the Hessian of the first function is computationally unavailable while the Hessian of the second function has a computationally…
Motivated by applications in optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving stochastic optimization problems. In the literature, the convergence analysis of these algorithms relies on strong…
Stochastic Gradient (SG) is the defacto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective $f$ and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of…