Related papers: Second-Order Forward-Mode Automatic Differentiatio…

Fast Second-Order Stochastic Backpropagation for Variational Inference

We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well. This is accomplished by…

Machine Learning · Statistics 2017-03-30 Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine Heller

Exact Stochastic Second Order Deep Learning

Optimization in Deep Learning is mainly dominated by first-order methods which are built around the central concept of backpropagation. Second-order optimization methods, which take into account the second-order derivatives, are far less…

Machine Learning · Computer Science 2021-04-09 Fares B. Mehouachi, Chaouki Kasmi

Gathering and Exploiting Higher-Order Information when Training Large Structured Models

When training large models, such as neural networks, the full derivatives of order 2 and beyond are usually inaccessible, due to their computational cost. Therefore, among the second-order optimization methods, it is common to bypass the…

Machine Learning · Computer Science 2025-10-01 Pierre Wolinski

A Subsampling Line-Search Method with Second-Order Results

In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic…

Optimization and Control · Mathematics 2021-07-01 El-houcine Bergou, Youssef Diouane, Vladimir Kunc, Vyacheslav Kungurtsev, Clément W. Royer

DOF: Accelerating High-order Differential Operators with Forward Propagation

Solving partial differential equations (PDEs) efficiently is essential for analyzing complex physical systems. Recent advancements in leveraging deep learning for solving PDEs have shown significant promise. However, machine learning…

Machine Learning · Computer Science 2024-02-16 Ruichen Li, Chuwei Wang, Haotian Ye, Di He, Liwei Wang

Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning

An algorithm is proposed for solving optimization problems arising in neural network training for supervised learning. The unique feature of the algorithm is the use of an auxiliary loss, in addition to the original loss employed for model…

Optimization and Control · Mathematics 2026-05-11 Yunlang Zhu, Lingjun Guo, Zahra Khatti, Xiaoyi Qu, Chia-Yuan Wu, Lara Zebiane, Frank E. Curtis

Second-Order Stochastic Optimization for Machine Learning in Linear Time

First-order stochastic methods are the state-of-the-art in large-scale machine learning optimization owing to efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored…

Machine Learning · Statistics 2017-12-01 Naman Agarwal, Brian Bullins, Elad Hazan

Gradients without Backpropagation

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic…

Machine Learning · Computer Science 2022-02-18 Atılım Güneş Baydin, Barak A. Pearlmutter, Don Syme, Frank Wood, Philip Torr

Faster Differentially Private Convex Optimization via Second-Order Methods

Differentially private (stochastic) gradient descent is the workhorse of DP private machine learning in both the convex and non-convex settings. Without privacy constraints, second-order methods, like Newton's method, converge faster than…

Machine Learning · Computer Science 2023-05-23 Arun Ganesh, Mahdi Haghifam, Thomas Steinke, Abhradeep Thakurta

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO…

Machine Learning · Computer Science 2022-09-20 Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

AdaSub: Stochastic Optimization Using Second-Order Information in Low-Dimensional Subspaces

We introduce AdaSub, a stochastic optimization algorithm that computes a search direction based on second-order information in a low-dimensional subspace that is defined adaptively based on available current and past information. Compared…

Optimization and Control · Mathematics 2023-11-08 João Victor Galvão da Mata, Martin S. Andersen

HesScale: Scalable Computation of Hessian Diagonals

Second-order optimization uses curvature information about the objective function, which can help in faster convergence. However, such methods typically require expensive computation of the Hessian matrix, preventing their usage in a…

Machine Learning · Computer Science 2022-11-03 Mohamed Elsayed, A. Rupam Mahmood

A High-order Backpropagation Algorithm for Neural Stochastic Differential Equation Model

Neural stochastic differential equation model with a Brownian motion term can capture epistemic uncertainty of deep neural network from the perspective of a dynamical system. The goal of this paper is to improve the convergence rate of the…

Numerical Analysis · Mathematics 2025-09-09 Daili Sheng, Minghui Song, Xiang Peng, Xuanqi Dong

New Methods for Parametric Optimization via Differential Equations

We develop and analyze several different second-order algorithms for computing a near-optimal solution path of a convex parametric optimization problem with smooth Hessian. Our algorithms are inspired by a differential equation perspective…

Optimization and Control · Mathematics 2023-06-16 Heyuan Liu, Paul Grigas

A Quasi-Newton Primal-Dual Algorithm with Line Search

Quasi-Newton methods refer to a class of algorithms at the interface between first and second order methods. They aim to progress as substantially as second order methods per iteration, while maintaining the computational complexity of…

Optimization and Control · Mathematics 2024-05-14 Shida Wang, Jalal Fadili, Peter Ochs

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper"…

Optimization and Control · Mathematics 2022-05-06 Robert Dyro, Edward Schmerling, Nikos Arechiga, Marco Pavone

Faster AutoAugment: Learning Augmentation Strategies using Backpropagation

Data augmentation methods are indispensable heuristics to boost the performance of deep neural networks, especially in image recognition tasks. Recently, several studies have shown that augmentation strategies found by search algorithms…

Computer Vision and Pattern Recognition · Computer Science 2019-11-19 Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, Hideki Nakayama

Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information

We present a novel adaptive optimization algorithm for large-scale machine learning problems. Equipped with a low-cost estimate of local curvature and Lipschitz smoothness, our method dynamically adapts the search direction and step-size.…

Machine Learning · Computer Science 2021-09-14 Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney, Martin Takáč

On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width

Second-order optimization has been developed to accelerate the training of deep neural networks and it is being applied to increasingly larger-scale models. In this study, towards training on further larger scales, we identify a specific…

Machine Learning · Computer Science 2024-06-11 Satoki Ishikawa, Ryo Karakida

Highly Efficient Hierarchical Online Nonlinear Regression Using Second Order Methods

We introduce highly efficient online nonlinear regression algorithms that are suitable for real life applications. We process the data in a truly online manner such that no storage is needed, i.e., the data is discarded after being used.…

Machine Learning · Computer Science 2017-01-19 Burak C. Civek, Ibrahim Delibalta, Suleyman S. Kozat