
Related papers: Refining Adaptive Zeroth-Order Optimization at Eas…

200 papers

The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems.…

Machine Learning · Computer Science 2019-10-17 Xiangyi Chen, Sijia Liu, Kaidi Xu, Xingguo Li, Xue Lin, Mingyi Hong, David Cox

We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no convergence…

Machine Learning · Computer Science 2026-05-06 Hassan Dbouk, Nidham Gazagnadou, Matthias Reisser, Christos Louizos

Zeroth-Order optimization presents a promising memory-efficient paradigm for fine-tuning Large Language Models by relying solely on forward passes. However, its practical adoption is severely constrained by slow wall-clock convergence and…

Machine Learning · Computer Science 2026-04-21 Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding

Zeroth-order (ZO) optimization is popular in real-world applications where gradient information is expensive to access or unavailable. Recently, adaptive ZO methods that normalize gradient estimators by the empirical standard deviation…

Optimization and Control · Mathematics 2026-02-03 Haishan Ye, Luo Luo

Zeroth-order optimization (ZO) has been a powerful framework for solving black-box problems, which estimates gradients using zeroth-order data to update variables iteratively. The practical applicability of ZO critically depends on the…

Optimization and Control · Mathematics 2026-03-03 Ruiyang Jin, Yuke Zhou, Yujie Tang, Jie Song, Siyang Gao

As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive…

Machine Learning · Computer Science 2018-06-08 Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini

Zeroth-Order (ZO) optimization has emerged as a promising solution for fine-tuning LLMs under strict memory constraints, as it avoids the prohibitive memory cost of storing activations for backpropagation. However, existing ZO methods…

Machine Learning · Computer Science 2026-05-25 Wei Lin, Yining Jiang, Qingyu Song, Qiao Xiang, Hong Xu

Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance reduced ZO proximal algorithms have been proposed to speed up ZO optimization for…

Optimization and Control · Mathematics 2024-10-04 Bin Gu, Xiyuan Wei, Hualin Zhang, Yi Chang, Heng Huang

Fine-tuning LLMs is necessary for various dedicated downstream tasks, but classic backpropagation-based fine-tuning methods require substantial GPU memory. To this end, a recent work, MeZO, which relies solely on forward passes to fine-tune…

Machine Learning · Computer Science 2026-05-04 Zhijie Cai, Haolong Chen, Guangxu Zhu

Lowering the memory requirement of full-parameter training on large models has become a hot research area. MeZO fine-tunes large language models (LLMs) using only forward passes in a zeroth-order SGD optimizer (ZO-SGD), demonstrating…

Machine Learning · Computer Science 2023-12-27 Shuoran Jiang, Qingcai Chen, Youchen Pan, Yang Xiang, Yukang Lin, Xiangping Wu, Chuanyi Liu, Xiaobao Song

A major challenge in applying zeroth-order (ZO) methods is the high query complexity, especially when queries are costly. We propose a novel gradient estimation technique for ZO methods based on adaptive lazy queries, which we term LAZO.…

Machine Learning · Computer Science 2022-06-16 Quan Xiao, Qing Ling, Tianyi Chen

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it…

Machine Learning · Computer Science 2020-06-23 Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

Optimizing large-scale nonconvex problems, common in deep learning, demands balancing rapid convergence with computational efficiency. First-order (FO) optimizers, which serve as today's baselines, provide fast convergence and good…

Machine Learning · Computer Science 2025-09-30 Jiahe Chen, Ziye Ma

Zeroth-order optimization (ZO) is widely used for solving black-box optimization and control problems. In particular, single-point ZO (SZO) is well-suited to online or dynamic problem settings due to its requirement of only a single…

Optimization and Control · Mathematics 2026-02-06 Xin Chen, Zhaolin Ren

In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method…

Optimization and Control · Mathematics 2022-01-19 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Zeroth-order optimization (ZO) has demonstrated remarkable promise in efficient fine-tuning tasks for Large Language Models (LLMs). In particular, recent advances incorporate the low-rankness of gradients, introducing low-rank ZO estimators…

Machine Learning · Computer Science 2025-02-03 Yan Sun, Tiansheng Huang, Liang Ding, Li Shen, Dacheng Tao

Many important machine learning applications amount to solving minimax optimization problems, and in many cases there is no access to the gradient information, but only the function values. In this paper, we focus on such a gradient-free…

Machine Learning · Computer Science 2021-03-23 Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor

Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization stood out as a promising…

Machine Learning · Computer Science 2025-11-04 Qitao Tan, Jun Liu, Zheng Zhan, Caiwei Ding, Yanzhi Wang, Xiaolong Ma, Jaewoo Lee, Jin Lu, Geng Yuan

Zeroth-order (ZO) optimization, learning from finite differences of function evaluations without backpropagation, has recently regained attention in deep learning due to its memory efficiency and applicability to gray- or black-box…

Zeroth-order (ZO) optimization has emerged as a promising alternative to gradient-based backpropagation methods, particularly for black-box optimization and large language model (LLM) fine-tuning. However, ZO methods often suffer from slow…

Machine Learning · Computer Science 2025-05-26 Sihwan Park, Jihun Yun, SungYub Kim, Souvik Kundu, Eunho Yang