Related papers: Dictionary descent in optimization
Greedy algorithms have been successfully analyzed and applied in training neural networks for solving variational problems, ensuring guaranteed convergence orders. In this paper, we extend the analysis of the orthogonal greedy algorithm…
Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of…
This paper considers the problem of online optimization where the objective function is time-varying. In particular, we extend coordinate descent type algorithms to the online case, where the objective function varies after a finite number…
We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of…
The paper gives a systematic study of the approximate versions of three greedy-type algorithms that are widely used in convex optimization. By an approximate version we mean one in which some of the evaluations are made with an error. Importance…
Greedy algorithms which use only function evaluations are applied to convex optimization in a general Banach space $X$. Along with algorithms that use exact evaluations, algorithms with approximate evaluations are treated. A priori upper…
Submodular function minimization is a fundamental optimization problem that arises in several applications in machine learning and computer vision. The problem is known to be solvable in polynomial time, but general purpose algorithms have…
Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes. They have been used in applications for many years, and their popularity…
Prediction-correction algorithms are a highly effective class of methods for solving pseudo-convex optimization problems. The descent direction of these algorithms can be viewed as an adjustment to the gradient direction based on the…
In this paper we analyze several new methods for solving nonconvex optimization problems with the objective function formed as a sum of two terms: one is nonconvex and smooth, and the other is convex but simple, with known structure.…
We propose a new gradient descent algorithm with added stochastic terms for finding the global optimizers of nonconvex optimization problems. A key component in the algorithm is the adaptive tuning of the randomness based on the value of…
The success of deep neural networks hinges on our ability to accurately and efficiently optimize high-dimensional, non-convex functions. In this paper, we empirically investigate the loss functions of state-of-the-art networks, and how…
We consider a class of combinatorial optimization problems that emerge in a variety of domains, including condensed matter physics, the theory of financial risks, error-correcting codes in information transmission, molecular and protein…
In this paper we propose a variant of the random coordinate descent method for solving linearly constrained convex optimization problems with composite objective functions. If the smooth part of the objective function has Lipschitz…
We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries. Under sparsity assumptions on the gradients or function values, we present two algorithms: a successive component/feature…
We investigate different randomizations for the mirror descent method. We try to propose a randomization that allows us to exploit the sparsity of the problem as much as possible. In the paper one can also find a generalization of randomized…
This paper develops a general theory for first-order descent methods whose search directions are restricted to a prescribed dictionary in a reflexive Banach space. Instead of assuming that the linear span of the dictionary is dense, as in…
This paper considers the analysis of continuous time gradient-based optimization algorithms through the lens of nonlinear contraction theory. It demonstrates that in the case of a time-invariant objective, most elementary results on…
We analyze the orthogonal greedy algorithm when applied to dictionaries $\mathbb{D}$ whose convex hull has small entropy. We show that if the metric entropy of the convex hull of $\mathbb{D}$ decays at a rate of $O(n^{-\frac{1}{2}-\alpha})$…
In this paper, we study randomized and cyclic coordinate descent for convex unconstrained optimization problems. We improve the known convergence rates in some cases by using the numerical semidefinite programming performance estimation…