Related papers: Generalized Framework for Nonlinear Acceleration
Despite the impressive numerical performance of the quasi-Newton and Anderson/nonlinear acceleration methods, their global convergence rates have remained elusive for over 50 years. This study addresses this long-standing issue by…
Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed point iterations $x_{t+1}=G(x_t)$, e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-\alpha\nabla f(x)$. It…
We describe a convergence acceleration technique for unconstrained optimization problems. Our scheme computes estimates of the optimum from a nonlinear average of the iterates produced by any optimization method. The weights in this average…
In this paper, we present a generic framework to extend existing uniformly optimal convex programming algorithms to solve more general nonlinear, possibly nonconvex, optimization problems. The basic idea is to incorporate a local search…
This paper develops a new class of nonlinear acceleration algorithms based on extending conjugate residual-type procedures from linear to nonlinear equations. The main algorithm has strong similarities with Anderson acceleration as well as…
Regularized nonlinear acceleration (RNA) estimates the minimum of a function by post-processing iterates from an algorithm such as the gradient method. It can be seen as a regularized version of Anderson acceleration, a classical…
We consider nonlinear convergence acceleration methods for fixed-point iteration $x_{k+1}=q(x_k)$, including Anderson acceleration (AA), nonlinear GMRES (NGMRES), and Nesterov-type acceleration (corresponding to AA with window size one). We…
Incorporating second order curvature information in gradient based methods have shown to improve convergence drastically despite its computational intensity. In this paper, we propose a stochastic (online) quasi-Newton method with…
Optimization is important in machine learning problems, and quasi-Newton methods have a reputation as the most efficient numerical schemes for smooth unconstrained optimization. In this paper, we consider the explicit superlinear…
We consider the application of the type-I Anderson acceleration to solving general non-smooth fixed-point problems. By interleaving with safe-guarding steps, and employing a Powell-type regularization and a re-start checking for strong…
Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex. Previous works develop near-optimal accelerated algorithms for minimizing this class…
In this paper, we study greedy variants of quasi-Newton methods. They are based on the updating formulas from a certain subclass of the Broyden family. In particular, this subclass includes the well-known DFP, BFGS and SR1 updates. However,…
Nonlinear acceleration methods are powerful techniques to speed up fixed-point iterations. However, many acceleration methods require storing a large number of previous iterates and this can become impractical if computational resources are…
The Regularized Nonlinear Acceleration (RNA) algorithm is an acceleration method capable of improving the rate of convergence of many optimization schemes such as gradient descend, SAGA or SVRG. Until now, its analysis is limited to convex…
Graph Neural Networks (GNNs) have demonstrated effectiveness in various graph-based tasks. However, their inefficiency in training and inference presents challenges for scaling up to real-world and large-scale graph applications. To address…
Gradient restarting has been shown to improve the numerical performance of accelerated gradient methods. This paper provides a mathematical analysis to understand these advantages. First, we establish global linear convergence guarantees…
In this paper, we propose a novel Anderson's acceleration method to solve nonlinear equations, which does \emph{not} require a restart strategy to achieve numerical stability. We propose the greedy and random versions of our algorithm.…
We derive efficient algorithms to compute weakly Pareto optimal solutions for smooth, convex and unconstrained multiobjective optimization problems in general Hilbert spaces. To this end, we define a novel inertial gradient-like dynamical…
This paper examines a number of extrapolation and acceleration methods, and introduces a few modifications of the standard Shanks transformation that deal with general sequences. One of the goals of the paper is to lay out a general…
The method of nonlinear conjugate gradients (NCG) is widely used in practice for unconstrained optimization, but it satisfies weak complexity bounds at best when applied to smooth convex functions. In contrast, Nesterov's accelerated…