Related papers: Exact Stochastic Second Order Deep Learning

200 papers

While the superior performance of second-order optimization methods such as Newton's method is well known, they are hardly used in practice for deep learning because neither assembling the Hessian matrix nor calculating its inverse is…

Machine Learning · Computer Science 2020-09-16 Siyuan Shen, Tianjia Shao, Kun Zhou, Chenfanfu Jiang, Feng Luo, Yin Yang

Trust region and cubic regularization methods have demonstrated good performance in small scale non-convex optimization, showing the ability to escape from saddle points. Each iteration of these methods involves computation of gradient,…

Optimization and Control · Mathematics 2018-09-27 Liu Liu, Xuanqing Liu, Cho-Jui Hsieh, Dacheng Tao

While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively slow convergence, sensitivity to the settings of…

Optimization and Control · Mathematics 2018-02-19 Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney

In this paper, we generalize (accelerated) Newton's method with cubic regularization under inexact second-order information for (strongly) convex optimization problems. Under mild assumptions, we provide global rate of convergence of these…

Optimization and Control · Mathematics 2017-10-17 Saeed Ghadimi, Han Liu, Tong Zhang

Second-order optimization methods are among the most widely used optimization approaches for convex optimization problems, and have recently been used to optimize non-convex optimization problems such as deep learning models. The widely…

Optimization and Control · Mathematics 2022-02-01 Dinesh Singh, Hardik Tankaria, Makoto Yamada

We analyze Newton's method with lazy Hessian updates for solving general possibly non-convex optimization problems. We propose to reuse a previously seen Hessian for several iterations while computing new gradients at each step of the…

Optimization and Control · Mathematics 2023-06-16 Nikita Doikov, El Mahdi Chayti, Martin Jaggi

We here adapt an extended version of the adaptive cubic regularisation method with dynamic inexact Hessian information for nonconvex optimisation in [3] to the stochastic optimisation setting. While exact function evaluations are still…

Numerical Analysis · Mathematics 2020-09-15 Stefania Bellavia, Gianmarco Gurioli

Despite their popularity in the field of continuous optimisation, second-order quasi-Newton methods are challenging to apply in machine learning, as the Hessian matrix is intractably large. This computational burden is exacerbated by the…

Machine Learning · Computer Science 2024-02-28 Elre T. Oldewage, Ross M. Clarke, José Miguel Hernández-Lobato

In this paper, we propose a distributed second-order method for reinforcement learning. Our approach is the fastest in the literature so far, as it outperforms state-of-the-art methods, including ADMM, by significant margins. We achieve this by…

Optimization and Control · Mathematics 2016-08-08 Rasul Tutunov, Haitham Bou-Ammar, Ali Jadbabaie

In this paper, we study stochastic non-convex optimization with non-convex random functions. Recent studies on non-convex optimization revolve around establishing second-order convergence, i.e., converging to a nearly second-order optimal…

Optimization and Control · Mathematics 2017-11-02 Mingrui Liu, Tianbao Yang

First-order stochastic methods are the state-of-the-art in large-scale machine learning optimization owing to efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored…

Machine Learning · Statistics 2017-12-01 Naman Agarwal, Brian Bullins, Elad Hazan

Differentially private (stochastic) gradient descent is the workhorse of differentially private (DP) machine learning in both the convex and non-convex settings. Without privacy constraints, second-order methods, like Newton's method, converge faster than…

Machine Learning · Computer Science 2023-05-23 Arun Ganesh, Mahdi Haghifam, Thomas Steinke, Abhradeep Thakurta

We extend the standard notion of self-concordance to non-convex optimization and develop a family of second-order algorithms with global convergence guarantees. In particular, two function classes -- weakly self-concordant…

Optimization and Control · Mathematics 2026-04-07 Donald Goldfarb, Lexiao Lai, Tianyi Lin, Jiayu Zhang

An algorithm is proposed for solving optimization problems arising in neural network training for supervised learning. The unique feature of the algorithm is the use of an auxiliary loss, in addition to the original loss employed for model…

Optimization and Control · Mathematics 2026-05-11 Yunlang Zhu, Lingjun Guo, Zahra Khatti, Xiaoyi Qu, Chia-Yuan Wu, Lara Zebiane, Frank E. Curtis

Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in…

Multiagent Systems · Computer Science 2020-04-01 Stefan Vlaski, Ali H. Sayed

In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic…

Optimization and Control · Mathematics 2021-07-01 El-houcine Bergou, Youssef Diouane, Vladimir Kunc, Vyacheslav Kungurtsev, Clément W. Royer

Second-order methods are emerging as promising alternatives to standard first-order optimizers such as gradient descent and ADAM for training neural networks. Though the advantages of including curvature information in computing…

Machine Learning · Computer Science 2025-10-15 Conor Rowan

Newton's method is the most widespread high-order method, demanding the gradient and the Hessian of the objective function. However, the main disadvantages of Newton's method are its lack of global convergence and its high iteration cost.…

Optimization plays a key role in machine learning. Recently, stochastic second-order methods have attracted much attention due to their low computational cost in each iteration. However, these algorithms might perform poorly especially if…

Machine Learning · Computer Science 2017-10-25 Haishan Ye, Zhihua Zhang

Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence. But they are…

Machine Learning · Computer Science 2017-12-21 Huishuai Zhang, Caiming Xiong, James Bradbury, Richard Socher