Related papers: Efficiently Using Second Order Information in Larg…

Second-order optimization with lazy Hessians

We analyze Newton's method with lazy Hessian updates for solving general possibly non-convex optimization problems. We propose to reuse a previously seen Hessian for several iterations while computing new gradients at each step of the…

Optimization and Control · Mathematics 2023-06-16 Nikita Doikov, El Mahdi Chayti, Martin Jaggi

Second-order orthant-based methods with enriched Hessian information for sparse $\ell_1$-optimization

We present a second order algorithm, based on orthantwise directions, for solving optimization problems involving the sparsity enhancing $\ell_1$-norm. The main idea of our method consists in modifying the descent orthantwise directions by…

Optimization and Control · Mathematics 2016-07-05 J. C. De los Reyes, E. Loayza, P. Merino

Second-Order Methods with Cubic Regularization Under Inexact Information

In this paper, we generalize (accelerated) Newton's method with cubic regularization under inexact second-order information for (strongly) convex optimization problems. Under mild assumptions, we provide global rate of convergence of these…

Optimization and Control · Mathematics 2017-10-17 Saeed Ghadimi, Han Liu, Tong Zhang

A Distributed Second-Order Algorithm You Can Trust

Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive…

Machine Learning · Computer Science 2018-06-21 Celestine Dünner, Aurelien Lucchi, Matilde Gargiani, An Bian, Thomas Hofmann, Martin Jaggi

An augmented Lagrangian method exploiting an active-set strategy and second-order information

In this paper, we consider nonlinear optimization problems with nonlinear equality constraints and bound constraints on the variables. For the solution of such problems, many augmented Lagrangian methods have been defined in the literature.…

Optimization and Control · Mathematics 2022-01-12 Andrea Cristofari, Gianni Di Pillo, Giampaolo Liuzzi, Stefano Lucidi

Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function

Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties for non-convex optimization by concurrently computing function value, gradient, and Hessian matrix to obtain the…

Machine Learning · Computer Science 2023-10-19 Liu Liu, Xuanqing Liu, Cho-Jui Hsieh, Dacheng Tao

Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning

An algorithm is proposed for solving optimization problems arising in neural network training for supervised learning. The unique feature of the algorithm is the use of an auxiliary loss, in addition to the original loss employed for model…

Optimization and Control · Mathematics 2026-05-11 Yunlang Zhu, Lingjun Guo, Zahra Khatti, Xiaoyi Qu, Chia-Yuan Wu, Lara Zebiane, Frank E. Curtis

An hybrid stochastic Newton algorithm for logistic regression

In this paper, we investigate a second-order stochastic algorithm for solving large-scale binary classification problems. We propose to make use of a new hybrid stochastic Newton algorithm that includes two weighted components in the…

Computation · Statistics 2025-12-02 Bernard Bercu, Luis Fredes, Eméric Gbaguidi

Second-Order Stochastic Optimization for Machine Learning in Linear Time

First-order stochastic methods are the state-of-the-art in large-scale machine learning optimization owing to efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored…

Machine Learning · Statistics 2017-12-01 Naman Agarwal, Brian Bullins, Elad Hazan

On the Acceleration of L-BFGS with Second-Order Information and Stochastic Batches

This paper proposes a framework of L-BFGS based on the (approximate) second-order information with stochastic batches, as a novel approach to the finite-sum minimization problems. Different from the classical L-BFGS where stochastic batches…

Machine Learning · Computer Science 2018-07-17 Jie Liu, Yu Rong, Martin Takac, Junzhou Huang

A multilevel framework for sparse optimization with application to inverse covariance estimation and logistic regression

Solving l1 regularized optimization problems is common in the fields of computational biology, signal processing and machine learning. Such l1 regularization is utilized to find sparse minimizers of convex functions. A well-known example is…

Numerical Analysis · Computer Science 2016-07-04 Eran Treister, Javier S. Turek, Irad Yavneh

A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization

In this work we describe an Adaptive Regularization using Cubics (ARC) method for large-scale nonconvex unconstrained optimization using Limited-memory Quasi-Newton (LQN) matrices. ARC methods are a relatively new family of optimization…

Optimization and Control · Mathematics 2022-04-21 Jarad Forristal, Joshua Griffin, Wenwen Zhou, Seyedalireza Yektamaram

Stochastic quasi-Newton with line-search regularization

In this paper we present a novel quasi-Newton algorithm for use in stochastic optimisation. Quasi-Newton methods have had an enormous impact on deterministic optimisation problems because they afford rapid convergence and computationally…

Systems and Control · Electrical Eng. & Systems 2019-09-04 Adrian Wills, Thomas Schön

Iterative Hessian Sketch in Input Sparsity Time

Scalable algorithms to solve optimization and regression tasks, even approximately, are needed to work with large datasets. In this paper we study efficient techniques from matrix sketching to solve a variety of convex constrained regression…

Machine Learning · Computer Science 2019-11-01 Graham Cormode, Charlie Dickens

Sub-Sampled Newton Methods I: Globally Convergent Algorithms

Large scale optimization problems are ubiquitous in machine learning and data analysis and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, as a way to either speed up the…

Optimization and Control · Mathematics 2016-02-29 Farbod Roosta-Khorasani, Michael W. Mahoney

Stochastic Second-Order Optimization via von Neumann Series

A stochastic iterative algorithm approximating second-order information using von Neumann series is discussed. We present convergence guarantees for strongly-convex and smooth functions. Our analysis is much simpler in contrast to a similar…

Optimization and Control · Mathematics 2017-04-14 Mojmir Mutny

Distributed Optimization Algorithm with Superlinear Convergence Rate

This paper considers distributed optimization problems, where each agent cooperatively minimizes the sum of local objective functions through the communication with its neighbors. The widely adopted distributed gradient method in solving…

Optimization and Control · Mathematics 2025-08-19 Yeming Xu, Ziyuan Guo, Kaihong Lu, Huanshui Zhang

Scaled minimax optimality in high-dimensional linear regression: A non-convex algorithmic regularization approach

The question of fast convergence in the classical problem of high dimensional linear regression has been extensively studied. Arguably, one of the fastest procedures in practice is Iterative Hard Thresholding (IHT). Still, IHT relies…

Statistics Theory · Mathematics 2020-08-28 Mohamed Ndaoud

Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning

Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling…

Machine Learning · Computer Science 2024-09-24 Jiaqi Li, Yuanhao Lai, Rui Wang, Changjian Shui, Sabyasachi Sahoo, Charles X. Ling, Shichun Yang, Boyu Wang, Christian Gagné, Fan Zhou

Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis

Recently several methods were proposed for sparse optimization which make careful use of second-order information [10, 28, 16, 3] to improve local convergence rates. These methods construct a composite quadratic approximation using Hessian…

Machine Learning · Computer Science 2015-07-15 Katya Scheinberg, Xiaocheng Tang