Related papers: Distributed Averaging Methods for Randomized Secon…

Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds

We consider distributed optimization methods for problems where forming the Hessian is computationally challenging and communication is a significant bottleneck. We leverage randomized sketches for reducing the problem dimensions as well as…

Optimization and Control · Mathematics 2022-03-21 Burak Bartan, Mert Pilanci

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization

In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data. However, the local estimates on each machine are typically biased, relative to…

Machine Learning · Computer Science 2020-07-06 Michał Dereziński, Burak Bartan, Mert Pilanci, Michael W. Mahoney

Optimal Shrinkage for Distributed Second-Order Optimization

In this work, we address the problem of Hessian inversion bias in distributed second-order optimization algorithms. We introduce a novel shrinkage-based estimator for the resolvent of Gram matrices which is asymptotically unbiased, and…

Optimization and Control · Mathematics 2024-02-06 Fangzhao Zhang, Mert Pilanci

Learning Linear Models Using Distributed Iterative Hessian Sketching

This work considers the problem of learning the Markov parameters of a linear system from observed data. Recent non-asymptotic system identification results have characterized the sample complexity of this problem in the single and…

Optimization and Control · Mathematics 2021-12-09 Han Wang, James Anderson

Distributed Optimization Algorithm with Superlinear Convergence Rate

This paper considers distributed optimization problems, where each agent cooperatively minimizes the sum of local objective functions through communication with its neighbors. The widely adopted distributed gradient method in solving…

Optimization and Control · Mathematics 2025-08-19 Yeming Xu, Ziyuan Guo, Kaihong Lu, Huanshui Zhang

Sub-Sampled Newton Methods I: Globally Convergent Algorithms

Large scale optimization problems are ubiquitous in machine learning and data analysis and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, as a way to either speed up the…

Optimization and Control · Mathematics 2016-02-29 Farbod Roosta-Khorasani, Michael W. Mahoney

A Distributed Continuous-time Modified Newton-Raphson Algorithm

We propose a continuous-time second-order optimization algorithm for solving unconstrained convex optimization problems with bounded Hessian. We show that this alternative algorithm has a comparable convergence rate to that of the…

Optimization and Control · Mathematics 2021-05-21 Hossein Moradian, Solmaz S. Kia

A Distributed Second-Order Algorithm You Can Trust

Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive…

Machine Learning · Computer Science 2018-06-21 Celestine Dünner, Aurelien Lucchi, Matilde Gargiani, An Bian, Thomas Hofmann, Martin Jaggi

A Distributed Quasi-Newton Algorithm for Primal and Dual Regularized Empirical Risk Minimization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving empirical risk minimization (ERM) problems with a nonsmooth regularization term. Our algorithm is applicable…

Machine Learning · Computer Science 2019-12-16 Ching-pei Lee, Cong Han Lim, Stephen J. Wright

Distributed Cross-Layer Optimization in Wireless Networks: A Second-Order Approach

Due to the rapidly growing scale and heterogeneity of wireless networks, the design of distributed cross-layer optimization algorithms has received significant interest from the networking research community. So far, the standard…

Networking and Internet Architecture · Computer Science 2016-11-18 Jia Liu, Cathy H. Xia, Ness B. Shroff, Hanif D. Sherali

Unbiased estimation of second-order parameter sensitivities for stochastic reaction networks

This paper deals with the problem of estimating second-order parameter sensitivities for stochastic reaction networks, where the reaction dynamics are modeled as a continuous-time Markov chain over a discrete state space. Estimation of such…

Probability · Mathematics 2014-07-29 Ankit Gupta, Mustafa Khammash

An Accelerated Second-Order Method for Distributed Stochastic Optimization

We consider distributed stochastic optimization problems that are solved with a master/workers computation architecture. Statistical arguments allow us to exploit statistical similarity and approximate this problem by a finite-sum problem, for…

Optimization and Control · Mathematics 2021-03-29 Artem Agafonov, Pavel Dvurechensky, Gesualdo Scutari, Alexander Gasnikov, Dmitry Kamzolov, Aleksandr Lukashevich, Amir Daneshmand

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this…

Optimization and Control · Mathematics 2018-05-29 Ching-pei Lee, Cong Han Lim, Stephen J. Wright

Fast, Accurate Second Order Methods for Network Optimization

Dual descent methods are commonly used to solve network flow optimization problems, since their implementation can be distributed over the network. These algorithms, however, often exhibit slow convergence rates. Approximate Newton methods…

Optimization and Control · Mathematics 2015-03-25 Rasul Tutunov, Haitham Bou Ammar, Ali Jadbabaie

OverSketched Newton: Fast Convex Optimization for Serverless Systems

Motivated by recent developments in serverless systems for large-scale computation as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-28 Vipul Gupta, Swanand Kadhe, Thomas Courtade, Michael W. Mahoney, Kannan Ramchandran

Distributed Sketching Methods for Privacy Preserving Regression

In this work, we study distributed sketching methods for large scale regression problems. We leverage multiple randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-23 Burak Bartan, Mert Pilanci

Distributed Least Squares in Small Space via Sketching and Bias Reduction

Matrix sketching is a powerful tool for reducing the size of large data matrices. Yet there are fundamental limitations to this size reduction when we want to recover an accurate estimator for a task such as least squares regression. We show…

Data Structures and Algorithms · Computer Science 2024-05-10 Sachin Garg, Kevin Tan, Michał Dereziński

Distributed estimation of the inverse Hessian by determinantal averaging

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not…

Machine Learning · Computer Science 2019-05-29 Michał Dereziński, Michael W. Mahoney

Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. It arises in various application domains,…

Optimization and Control · Mathematics 2015-03-17 John Duchi, Alekh Agarwal, Martin Wainwright

Stochastic Bound Majorization

Recently, a majorization method for optimizing partition functions of log-linear models was proposed alongside a novel quadratic variational upper bound. In the batch setting, it outperformed state-of-the-art first- and second-order…

Machine Learning · Computer Science 2013-09-24 Anna Choromanska, Tony Jebara