Related papers: Last-iterate convergence rates for min-max optimiz…

200 papers

The stochastic gradient descent has been widely used for solving composite optimization problems in big data analyses. Many algorithms and convergence properties have been developed. The composite functions were convex primarily and…

Machine Learning · Statistics 2020-03-03 Takayuki Kawashima, Hironori Fujisawa

Many recent applications in machine learning and data fitting call for the algorithmic solution of structured smooth convex optimization problems. Although the gradient descent method is a natural choice for this task, it requires exact…

Optimization and Control · Mathematics 2013-09-03 Anthony Man-Cho So

Motivated by recent increased interest in optimization algorithms for non-convex optimization in application to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical…

In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a…

Machine Learning · Statistics 2018-12-27 Lam M. Nguyen, Nam H. Nguyen, Dzung T. Phan, Jayant R. Kalagnanam, Katya Scheinberg

Equilibrium computation on Riemannian manifolds provides a unifying framework for numerous problems in machine learning and data analytics. One of the simplest yet most fundamental methods is Riemannian gradient descent (RGD). While its…

Optimization and Control · Mathematics 2025-11-11 Yang Cai, Michael I. Jordan, Tianyi Lin, Argyris Oikonomou, Emmanouil-Vasileios Vlatakis-Gkaragkounis

In this paper, we consider first-order convergence theory and algorithms for solving a class of non-convex non-concave min-max saddle-point problems, whose objective function is weakly convex in the variables of minimization and weakly…

Optimization and Control · Mathematics 2021-07-08 Mingrui Liu, Hassan Rafique, Qihang Lin, Tianbao Yang

We consider regression problems with binary weights. Such optimization problems are ubiquitous in quantized learning models and digital communication systems. A natural approach is to optimize the corresponding Lagrangian using variants of…

Machine Learning · Computer Science 2020-12-01 Nisan Chiprut, Amir Globerson, Ami Wiesel

Gradient clipping is commonly used in training deep neural networks partly due to its practicability in relieving the exploding gradient problem. Recently, Zhang et al. (2019) show that clipped (stochastic) Gradient Descent (GD)…

Machine Learning · Computer Science 2020-10-30 Bohang Zhang, Jikai Jin, Cong Fang, Liwei Wang

Many large-scale constrained optimization problems can be formulated as bilevel distributed optimization tasks over undirected networks, where agents collaborate to minimize a global cost function while adhering to constraints, relying only…

Optimization and Control · Mathematics 2025-11-25 Ajay Tak, Mayank Baranwal

We prove novel convergence results for a stochastic proximal gradient algorithm suitable for solving a large class of convex optimization problems, where a convex objective function is given by the sum of a smooth and a possibly non-smooth…

Optimization and Control · Mathematics 2016-08-11 Lorenzo Rosasco, Silvia Villa, Bang Công Vũ

We formulate two classes of first-order algorithms more general than previously studied for minimizing smooth and strongly convex or, respectively, smooth and convex functions. We establish sufficient conditions, via new discrete Lyapunov…

Optimization and Control · Mathematics 2023-04-21 Penghui Fu, Zhiqiang Tan

We study the convergence rate for the last iterate of stochastic gradient descent (SGD) and stochastic heavy ball (SHB) in the parametric setting when the objective function $F$ is globally convex or non-convex whose gradient is…

Optimization and Control · Mathematics 2026-03-11 Marcel Hudiani

In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks, including fully-connected and convolutional neural networks with ReLU or LeakyReLU activations. In particular, we study…

Machine Learning · Computer Science 2021-01-01 Kaifeng Lyu, Jian Li

The convergence of stochastic gradient descent is highly dependent on the step-size, especially on non-convex problems such as neural network training. Step decay step-size schedules (constant and then cut) are widely used in practice…

Optimization and Control · Mathematics 2021-02-19 Xiaoyu Wang, Sindri Magnússon, Mikael Johansson

We consider a general class of regression models with normally distributed covariates, and the associated nonconvex problem of fitting these models from data. We develop a general recipe for analyzing the convergence of iterative algorithms…

Optimization and Control · Mathematics 2021-09-22 Kabir Aladin Chandrasekher, Ashwin Pananjady, Christos Thrampoulidis

The optimization algorithms are crucial in training physics-informed neural networks (PINNs), as unsuitable methods may lead to poor solutions. Compared to the common gradient descent (GD) algorithm, implicit gradient descent (IGD)…

Machine Learning · Computer Science 2025-08-04 Xianliang Xu, Ting Du, Wang Kong, Bin Shan, Ye Li, Zhongyi Huang

Accelerating the convergence of second-order optimization, particularly Newton-type methods, remains a pivotal challenge in algorithmic research. In this paper, we extend previous work on the Quadratic Gradient (QG) and rigorously…

Optimization and Control · Mathematics 2026-04-01 John Chiang

The classical convergence analysis of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this might hold for some loss functions, it is violated for cases where the objective…

Optimization and Control · Mathematics 2019-11-12 Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč, Marten van Dijk

A variant of consensus based distributed gradient descent (DGD) is studied for finite sums of smooth but possibly non-convex functions. In particular, the local gradient term in the fixed step-size iteration of each agent is…

Optimization and Control · Mathematics 2026-05-27 Lei Qin, Michael Cantoni, Ye Pu

We study a hybrid conditional gradient - smoothing algorithm (HCGS) for solving composite convex optimization problems which contain several terms over a bounded set. Examples of these include regularization problems with several norms as…

Optimization and Control · Mathematics 2014-04-16 Andreas Argyriou, Marco Signoretto, Johan Suykens