Related papers: Zero-Order Sharpness-Aware Minimization

ZO-SAM: Zero-Order Sharpness-Aware Minimization for Efficient Sparse Training

Deep learning models, despite their impressive achievements, suffer from high computational costs and memory requirements, limiting their usability in resource-constrained environments. Sparse neural networks significantly alleviate these…

Machine Learning · Computer Science 2026-03-16 Jie Ji, Gen Li, Kaiyuan Deng, Fatemeh Afghah, Xiaolong Ma

Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms

Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance reduced ZO proximal algorithms have been proposed to speed up ZO optimization for…

Optimization and Control · Mathematics 2024-10-04 Bin Gu, Xiyuan Wei, Hualin Zhang, Yi Chang, Heng Huang

A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it…

Machine Learning · Computer Science 2020-06-23 Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

Powering Up Zeroth-Order Training via Subspace Gradient Orthogonalization

Zeroth-order (ZO) optimization provides a gradient-free alternative to first-order (FO) methods by estimating gradients via finite differences of function evaluations, and has recently emerged as a memory-efficient paradigm for fine-tuning…

Machine Learning · Computer Science 2026-02-24 Yicheng Lang, Changsheng Wang, Yihua Zhang, Mingyi Hong, Zheng Zhang, Wotao Yin, Sijia Liu

Zeroth-Order Regularized Optimization (ZORO): Approximately Sparse Gradients and Adaptive Sampling

We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or…

Optimization and Control · Mathematics 2023-03-20 HanQin Cai, Daniel Mckenzie, Wotao Yin, Zhenliang Zhang

ZERA: Zero-init Instruction Evolving Refinement Agent -- From Zero Instructions to Structured Prompts via Principle-based Optimization

Automatic Prompt Optimization (APO) improves large language model (LLM) performance by refining prompts for specific tasks. However, prior APO methods typically focus only on user prompts, rely on unstructured feedback, and require large…

Computation and Language · Computer Science 2025-09-26 Seungyoun Yi, Minsoo Khang, Sungrae Park

Zeroth-Order Optimization Finds Flat Minima

Zeroth-order methods are extensively used in machine learning applications where gradients are infeasible or expensive to compute, such as black-box attacks, reinforcement learning, and language model fine-tuning. Existing optimization…

Machine Learning · Computer Science 2025-11-12 Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, Michael Muehlebach, Niao He

Query-Efficient Zeroth-Order Algorithms for Nonconvex Constrained Optimization

Zeroth-order optimization (ZO) has been a powerful framework for solving black-box problems, which estimates gradients using zeroth-order data to update variables iteratively. The practical applicability of ZO critically depends on the…

Optimization and Control · Mathematics 2026-03-03 Ruiyang Jin, Yuke Zhou, Yujie Tang, Jie Song, Siyang Gao

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models

Fine-tuning is powerful for adapting large language models to downstream tasks, but it often results in huge memory usage. A promising approach to mitigate this is using Zeroth-Order (ZO) optimization, which estimates gradients to replace…

Machine Learning · Computer Science 2024-10-15 Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding

Localized Zeroth-Order Prompt Optimization

The efficacy of large language models (LLMs) in understanding and generating natural language has aroused a wide interest in developing prompt-based methods to harness the power of black-box LLMs. Existing methodologies usually prioritize a…

Artificial Intelligence · Computer Science 2024-03-06 Wenyang Hu, Yao Shu, Zongmin Yu, Zhaoxuan Wu, Xiangqiang Lin, Zhongxiang Dai, See-Kiong Ng, Bryan Kian Hsiang Low

Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning

Fine-tuning large language models (LLMs) has achieved remarkable success across various NLP tasks, but the substantial memory overhead during backpropagation remains a critical bottleneck, especially as model scales grow. Zeroth-order (ZO)…

Computation and Language · Computer Science 2026-01-09 Feihu Jin, Shipeng Cen, Ying Tan

Sparse Perturbations for Improved Convergence in Stochastic Zeroth-Order Optimization

Interest in stochastic zeroth-order (SZO) methods has recently been revived in black-box optimization scenarios such as adversarial black-box attacks to deep neural networks. SZO methods only require the ability to evaluate the objective…

Machine Learning · Statistics 2020-11-11 Mayumi Ohta, Nathaniel Berger, Artem Sokolov, Stefan Riezler

SAMOSA: Sharpness Aware Minimization for Open Set Active learning

Modern machine learning solutions require extensive data collection where labeling remains costly. To reduce this burden, open set active learning approaches aim to select informative samples from a large pool of unlabeled data that…

Machine Learning · Computer Science 2025-10-27 Young In Kim, Andrea Agiollo, Rajiv Khanna

SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes

Fine-tuning vision language models (VLMs) has achieved remarkable performance across various downstream tasks; yet, it requires access to model gradients through backpropagation (BP), making them unsuitable for memory-constrained,…

Machine Learning · Computer Science 2025-10-27 Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang

Why Does Adaptive Zeroth-Order Optimization Work?

Zeroth-order (ZO) optimization is popular in real-world applications where gradient information is expensive to access or unavailable. Recently, adaptive ZO methods that normalize gradient estimators by the empirical standard deviation…

Optimization and Control · Mathematics 2026-02-03 Haishan Ye, Luo Luo

ZOQO: Zero-Order Quantized Optimization

The increasing computational and memory demands in deep learning present significant challenges, especially in resource-constrained environments. We introduce a zero-order quantized optimization (ZOQO) method designed for training models…

Machine Learning · Computer Science 2025-01-14 Noga Bar, Raja Giryes

Zeroth-Order Sharpness-Aware Learning with Exponential Tilting

Classic zeroth-order optimization approaches typically optimize for a smoothed version of the original function, i.e., the expected objective under randomly perturbed model parameters. This can be interpreted as encouraging the loss values…

Machine Learning · Computer Science 2025-10-21 Xuchen Gong, Tian Li

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

Fine-tuning Large Language Models (LLMs) has proven effective for a variety of downstream tasks. However, as LLMs grow in size, the memory demands for backpropagation become increasingly prohibitive. Zeroth-order (ZO) optimization methods…

Machine Learning · Computer Science 2025-07-25 Ziming Yu, Pan Zhou, Sike Wang, Jia Li, Mi Tian, Hua Huang

Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence

Many important machine learning applications amount to solving minimax optimization problems, and in many cases there is no access to the gradient information, but only the function values. In this paper, we focus on such a gradient-free…

Machine Learning · Computer Science 2021-03-23 Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive…

Machine Learning · Computer Science 2018-06-08 Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini