Related papers: Refining Adaptive Zeroth-Order Optimization at Eas…

ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems.…

Machine Learning · Computer Science 2019-10-17 Xiangyi Chen, Sijia Liu, Kaidi Xu, Xingguo Li, Xue Lin, Mingyi Hong, David Cox

On Adaptivity in Zeroth-Order Optimization

We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no convergence…

Machine Learning · Computer Science 2026-05-06 Hassan Dbouk, Nidham Gazagnadou, Matthias Reisser, Christos Louizos

Universally Empowering Zeroth-Order Optimization via Adaptive Layer-wise Sampling

Zeroth-Order optimization presents a promising memory-efficient paradigm for fine-tuning Large Language Models by relying solely on forward passes. However, its practical adoption is severely constrained by slow wall-clock convergence and…

Machine Learning · Computer Science 2026-04-21 Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding

Why Does Adaptive Zeroth-Order Optimization Work?

Zeroth-order (ZO) optimization is popular in real-world applications where accessing the gradient information is expensive or unavailable. Recently, adaptive ZO methods that normalize gradient estimators by the empirical standard deviation…

Optimization and Control · Mathematics 2026-02-03 Haishan Ye, Luo Luo

Query-Efficient Zeroth-Order Algorithms for Nonconvex Constrained Optimization

Zeroth-order optimization (ZO) has been a powerful framework for solving black-box problems, which estimates gradients using zeroth-order data to update variables iteratively. The practical applicability of ZO critically depends on the…

Optimization and Control · Mathematics 2026-03-03 Ruiyang Jin, Yuke Zhou, Yujie Tang, Jie Song, Siyang Gao

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive…

Machine Learning · Computer Science 2018-06-08 Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini

AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning

Zeroth-Order (ZO) optimization has emerged as a promising solution for fine-tuning LLMs under strict memory constraints, as it avoids the prohibitive memory cost of storing activations for backpropagation. However, existing ZO methods…

Machine Learning · Computer Science 2026-05-25 Wei Lin, Yining Jiang, Qingyu Song, Qiao Xiang, Hong Xu

Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms

Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance reduced ZO proximal algorithms have been proposed to speed up ZO optimization for…

Optimization and Control · Mathematics 2024-10-04 Bin Gu, Xiyuan Wei, Hualin Zhang, Yi Chang, Heng Huang

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

Fine-tuning LLMs is necessary for various dedicated downstream tasks, but classic backpropagation-based fine-tuning methods require substantial GPU memory. To this end, a recent work, MeZO, which relies solely on forward passes to fine-tune…

Machine Learning · Computer Science 2026-05-04 Zhijie Cai, Haolong Chen, Guangxu Zhu

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization

Lowering the memory requirement in full-parameter training on large models has become a hot research area. MeZO fine-tunes the large language models (LLMs) by just forward passes in a zeroth-order SGD optimizer (ZO-SGD), demonstrating…

Machine Learning · Computer Science 2023-12-27 Shuoran Jiang, Qingcai Chen, Youchen Pan, Yang Xiang, Yukang Lin, Xiangping Wu, Chuanyi Liu, Xiaobao Song

Lazy Queries Can Reduce Variance in Zeroth-order Optimization

A major challenge in applying zeroth-order (ZO) methods is the high query complexity, especially when queries are costly. We propose a novel gradient estimation technique for ZO methods based on adaptive lazy queries that we term LAZO.…

Machine Learning · Computer Science 2022-06-16 Quan Xiao, Qing Ling, Tianyi Chen

A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it…

Machine Learning · Computer Science 2020-06-23 Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

VAMO: Efficient Zeroth-Order Variance Reduction for SGD with Faster Convergence

Optimizing large-scale nonconvex problems, common in deep learning, demands balancing rapid convergence with computational efficiency. First-order (FO) optimizers, which serve as today's baselines, provide fast convergence and good…

Machine Learning · Computer Science 2025-09-30 Jiahe Chen, Ziye Ma

Accelerating Single-Point Zeroth-Order Optimization with Regression-Based Gradient Surrogates

Zeroth-order optimization (ZO) is widely used for solving black-box optimization and control problems. In particular, single-point ZO (SZO) is well-suited to online or dynamic problem settings due to its requirement of only a single…

Optimization and Control · Mathematics 2026-02-06 Xin Chen, Zhaolin Ren

Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method…

Optimization and Control · Mathematics 2022-01-19 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs

Zeroth-order optimization (ZO) has demonstrated remarkable promise in efficient fine-tuning tasks for Large Language Models (LLMs). In particular, recent advances incorporate the low-rankness of gradients, introducing low-rank ZO estimators…

Machine Learning · Computer Science 2025-02-03 Yan Sun, Tiansheng Huang, Liang Ding, Li Shen, Dacheng Tao

Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence

Many important machine learning applications amount to solving minimax optimization problems, and in many cases there is no access to the gradient information, but only the function values. In this paper, we focus on such a gradient-free…

Machine Learning · Computer Science 2021-03-23 Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor

Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning

Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization stood out as a promising…

Machine Learning · Computer Science 2025-11-04 Qitao Tan, Jun Liu, Zheng Zhan, Caiwei Ding, Yanzhi Wang, Xiaolong Ma, Jaewoo Lee, Jin Lu, Geng Yuan

Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered

Zeroth-order (ZO) optimization, learning from finite differences of function evaluations without backpropagation, has recently regained attention in deep learning due to its memory efficiency and applicability to gray- or black-box…

Machine Learning · Computer Science 2026-05-19 Sijia Liu, Yicheng Lang, Soumyadeep Pal, Changsheng Wang, Yancheng Huang, Chongyu Fan, James Diffenderfer, Bhavya Kailkhura, Yihua Zhang

Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale

Zeroth-order (ZO) optimization has emerged as a promising alternative to gradient-based backpropagation methods, particularly for black-box optimization and large language model (LLM) fine-tuning. However, ZO methods often suffer from slow…

Machine Learning · Computer Science 2025-05-26 Sihwan Park, Jihun Yun, SungYub Kim, Souvik Kundu, Eunho Yang