Related papers: Alternating Differentiation for Optimization Layer…

Efficient differentiable quadratic programming layers: an ADMM approach

Recent advances in neural-network architecture allow for seamless integration of convex optimization problems as differentiable layers in an end-to-end trainable neural network. Integrating medium and large scale quadratic programs into a…

Optimization and Control · Mathematics 2021-12-15 Andrew Butler, Roy Kwon

AltGDmin: Alternating GD and Minimization for Partly-Decoupled (Federated) Optimization

This article describes a novel optimization solution framework, called alternating gradient descent (GD) and minimization (AltGDmin), that is useful for many problems for which alternating minimization (AltMin) is a popular solution. AltMin…

Machine Learning · Computer Science 2025-04-22 Namrata Vaswani

Alternating Direction Method of Multipliers for Quantization

Quantization of the parameters of machine learning models, such as deep neural networks, requires solving constrained optimization problems, where the constraint set is formed by the Cartesian product of many simple discrete sets. For such…

Optimization and Control · Mathematics 2021-03-02 Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra, Meisam Razaviyayn

Revisiting Implicit Differentiation for Learning Problems in Optimal Control

This paper proposes a new method for differentiating through optimal trajectories arising from non-convex, constrained discrete-time optimal control (COC) problems using the implicit function theorem (IFT). Previous works solve a…

Machine Learning · Computer Science 2023-10-25 Ming Xu, Timothy Molloy, Stephen Gould

Efficient and Modular Implicit Differentiation

Automatic differentiation (autodiff) has revolutionized machine learning. It allows complex computations to be expressed by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently,…

Machine Learning · Computer Science 2022-10-13 Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert

Alternating Direction Method of Multipliers for nonlinear constrained convex problems and applications to distributed resource allocation and constrained machine learning

We study a class of structured convex optimization problems, which have a two-block separable objective and nonlinear functional constraints as well as affine constraints that couple the two block variables. Such problems naturally arise…

Optimization and Control · Mathematics 2026-02-27 Zhengjie Xiong, Yangyang Xu

Fundamental Benefit of Alternating Updates in Minimax Optimization

The Gradient Descent-Ascent (GDA) algorithm, designed to solve minimax optimization problems, takes the descent and ascent steps either simultaneously (Sim-GDA) or alternately (Alt-GDA). While Alt-GDA is commonly observed to converge…

Optimization and Control · Mathematics 2024-07-16 Jaewook Lee, Hanseul Cho, Chulhee Yun

A Convergent ADMM Framework for Efficient Neural Network Training

As a well-known optimization framework, the Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications. Recently, it has attracted the attention of deep learning…

Machine Learning · Computer Science 2021-12-23 Junxiang Wang, Hongyi Li, Liang Zhao

Convergent Proximal Multiblock ADMM for Nonconvex Dynamics-Constrained Optimization

This paper proposes a provably convergent multiblock ADMM for nonconvex optimization with nonlinear dynamics constraints, overcoming the divergence issue in classical extensions. We consider a class of optimization problems that arise from…

Optimization and Control · Mathematics 2025-06-24 Bowen Li, Ya-xiang Yuan

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but…

Machine Learning · Statistics 2022-08-10 Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon

Alternating Updates for Efficient Transformers

It has been well established that increasing scale in deep transformer networks leads to improved quality and performance. However, this increase in scale often comes with prohibitive increases in compute cost and inference latency. We…

Machine Learning · Computer Science 2023-10-05 Cenk Baykal, Dylan Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

Lightweight Real-Time ALADIN for Distributed Optimization

This paper presents a real-time computational framework for multi-node distributed optimization by extending the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm. Our approach integrates adjoint sequential…

Optimization and Control · Mathematics 2026-04-17 Yifei Wang, Xuhui Feng, Shimin Pan, Liangfan Zhu, Xu Du, Apostolos I. Rikos

KKT Conditions, First-Order and Second-Order Optimization, and Distributed Optimization: Tutorial and Survey

This is a tutorial and survey paper on Karush-Kuhn-Tucker (KKT) conditions, first-order and second-order numerical optimization, and distributed optimization. After a brief review of history of optimization, we start with some preliminaries…

Optimization and Control · Mathematics 2021-10-06 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning

Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications, which aims to solve a nonconvex minimax optimization problem. However, the…

Machine Learning · Computer Science 2022-05-23 Ziyi Chen, Shaocong Ma, Yi Zhou

A two-level distributed algorithm for nonconvex constrained optimization

This paper aims to develop distributed algorithms for nonconvex optimization problems with complicated constraints associated with a network. The network can be a physical one, such as an electric power network, where the constraints are…

Optimization and Control · Mathematics 2022-11-21 Kaizhao Sun, X. Andy Sun

ADMM for Efficient Deep Learning with Global Convergence

Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered to be a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer.…

Optimization and Control · Mathematics 2021-07-07 Junxiang Wang, Fuxun Yu, Xiang Chen, Liang Zhao

Decomposition of non-convex optimization via bi-level distributed ALADIN

Decentralized optimization algorithms are important in different contexts, such as distributed optimal power flow or distributed model predictive control, as they avoid central coordination and enable decomposition of large-scale problems.…

Optimization and Control · Mathematics 2019-03-28 Alexander Engelmann, Yuning Jiang, Boris Houska, Timm Faulwasser

A Preconditioned Alternating Minimization Framework for Nonconvex and Half Quadratic Optimization

For some typical and widely used non-convex half-quadratic regularization models and the Ambrosio-Tortorelli approximate Mumford-Shah model, based on the Kurdyka-Łojasiewicz analysis and the recent nonconvex proximal algorithms, we…

Optimization and Control · Mathematics 2021-07-30 Shengxiang Deng, Ismail Ben Ayed, Hongpeng Sun

Running Primal-Dual Gradient Method for Time-Varying Nonconvex Problems

This paper considers a nonconvex optimization problem that evolves over time, and addresses the synthesis and analysis of regularized primal-dual gradient methods to track a Karush-Kuhn-Tucker (KKT) trajectory. The proposed regularized…

Optimization and Control · Mathematics 2018-12-04 Yujie Tang, Emiliano Dall'Anese, Andrey Bernstein, Steven Low

Understanding Alternating Minimization for Matrix Completion

Alternating Minimization is a widely used and empirically successful heuristic for matrix completion and related low-rank optimization problems. Theoretical guarantees for Alternating Minimization have been hard to come by and are still…

Machine Learning · Computer Science 2014-05-15 Moritz Hardt