Related papers: Learning to be Global Optimizer
We propose a new gradient descent algorithm with added stochastic terms for finding the global optimizers of nonconvex optimization problems. A key component in the algorithm is the adaptive tuning of the randomness based on the value of…
Learning to optimize - the idea that we can learn from data algorithms that optimize a numerical criterion - has recently been at the heart of a growing number of research efforts. One of the most challenging issues within this approach is…
The graduated optimization approach, also known as the continuation method, is a popular heuristic to solving non-convex problems that has received renewed interest over the last decade. Despite its popularity, very little is known in terms…
There is a recent surge of interest in nonconvex reformulations via low-rank factorization for stochastic convex semidefinite optimization problem in the purpose of efficiency and scalability. Compared with the original convex formulations,…
The key task of machine learning is to minimize the loss function that measures the model fit to the training data. The numerical methods to do this efficiently depend on the properties of the loss function. The most decisive among these…
Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order…
This paper proposes a stochastic gradient descent method with an adaptive Gaussian noise term for the global minimization of nearly convex functions, which are nonconvex and possess multiple strict local minimizers. The noise term,…
Learning to optimize is an approach that leverages training data to accelerate the solution of optimization problems. Many approaches use unrolling to parametrize the update step and learn optimal parameters. Although L2O has shown…
Graduated optimization is a global optimization technique that is used to minimize a multimodal nonconvex function by smoothing the objective function with noise and gradually refining the solution. This paper experimentally evaluates the…
Training neural networks requires optimizing a loss function that may be highly irregular, and in particular neither convex nor smooth. Popular training algorithms are based on stochastic gradient descent with momentum (SGDM), for which…
Nonconvex optimization problems arise in many areas of computational science and engineering and are (approximately) solved by a variety of algorithms. Existing algorithms usually only have local convergence or subsequence convergence of…
We propose an algorithm for optimizing the parameters of single hidden layer neural networks. Specifically, we derive a blockwise difference-of-convex (DC) functions representation of the objective function. Based on the latter, we propose…
We consider the problem of maximizing an unknown function over a compact and convex set using as few observations as possible. We observe that the optimization of the function essentially relies on learning the induced bipartite ranking…
The increasing reliance on numerical methods for controlling dynamical systems and training machine learning models underscores the need to devise algorithms that dependably and efficiently navigate complex optimization landscapes.…
Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space. Instead, these methods use supervised…
Global minimization is a fundamental challenge in optimization, especially in machine learning, where finding the global minimum of a function directly impacts model performance and convergence. This article introduces a novel optimization…
Motivated by recent increased interest in optimization algorithms for non-convex optimization in application to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical…
Differentially private (stochastic) gradient descent is the workhorse of DP private machine learning in both the convex and non-convex settings. Without privacy constraints, second-order methods, like Newton's method, converge faster than…
The success of deep neural networks hinges on our ability to accurately and efficiently optimize high-dimensional, non-convex functions. In this paper, we empirically investigate the loss functions of state-of-the-art networks, and how…
We propose a new globalization strategy that can be used in unconstrained optimization algorithms to support rapid convergence from remote starting points. Our approach is based on using multiple points at each iteration to build a…