Related papers: ZOQO: Zero-Order Quantized Optimization

Fine-tuning Quantized Neural Networks with Zeroth-order Optimization

As the size of large language models grows exponentially, GPU memory has become a bottleneck for adapting these models to downstream tasks. In this paper, we aim to push the limits of memory-efficient training by minimizing memory usage on…

Machine Learning · Computer Science 2026-02-13 Sifeng Shang, Jiayi Zhou, Chenyu Lin, Minxian Li, Kaiyang Zhou

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models

Language Models (LLMs) are often quantized to lower precision to reduce the memory cost and latency in inference. However, quantization often degrades model performance, thus fine-tuning is required for various down-stream tasks.…

Machine Learning · Computer Science 2025-02-19 Jiajun Zhou, Yifan Yang, Kai Zhen, Ziyue Liu, Yequan Zhao, Ershad Banijamali, Athanasios Mouchtaris, Ngai Wong, Zheng Zhang

A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it…

Machine Learning · Computer Science 2020-06-23 Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

Powering Up Zeroth-Order Training via Subspace Gradient Orthogonalization

Zeroth-order (ZO) optimization provides a gradient-free alternative to first-order (FO) methods by estimating gradients via finite differences of function evaluations, and has recently emerged as a memory-efficient paradigm for fine-tuning…

Machine Learning · Computer Science 2026-02-24 Yicheng Lang, Changsheng Wang, Yihua Zhang, Mingyi Hong, Zheng Zhang, Wotao Yin, Sijia Liu

Query-Efficient Zeroth-Order Algorithms for Nonconvex Constrained Optimization

Zeroth-order optimization (ZO) has been a powerful framework for solving black-box problems, which estimates gradients using zeroth-order data to update variables iteratively. The practical applicability of ZO critically depends on the…

Optimization and Control · Mathematics 2026-03-03 Ruiyang Jin, Yuke Zhou, Yujie Tang, Jie Song, Siyang Gao

End-to-End On-Device Quantization-Aware Training for LLMs at Inference Cost

Quantization is an effective technique to reduce the deployment cost of large language models (LLMs), and post-training quantization (PTQ) has been widely studied due to its efficiency. However, existing PTQ methods are limited by their…

Machine Learning · Computer Science 2025-09-30 Qitao Tan, Xiaoying Song, Jin Lu, Guoming Li, Jun Liu, Lingzi Hong, Caiwen Ding, Jundong Li, Xiaoming Zhai, Shaoyi Huang, Wei Niu, Geng Yuan

Zero-shot Adversarial Quantization

Model quantization is a promising approach to compress deep neural networks and accelerate inference, making it possible to be deployed on mobile and edge devices. To retain the high performance of full-precision models, most existing…

Computer Vision and Pattern Recognition · Computer Science 2021-03-31 Yuang Liu, Wei Zhang, Jun Wang

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models

Fine-tuning is powerful for adapting large language models to downstream tasks, but it often results in huge memory usage. A promising approach to mitigate this is using Zeroth-Order (ZO) optimization, which estimates gradients to replace…

Machine Learning · Computer Science 2024-10-15 Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding

DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training

Zeroth-order (ZO) optimization has become a popular technique for solving machine learning (ML) problems when first-order (FO) information is difficult or impossible to obtain. However, the scalability of ZO optimization remains an open…

Machine Learning · Computer Science 2024-03-18 Aochuan Chen, Yimeng Zhang, Jinghan Jia, James Diffenderfer, Jiancheng Liu, Konstantinos Parasyris, Yihua Zhang, Zheng Zhang, Bhavya Kailkhura, Sijia Liu

An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization

Molecule optimization is an important problem in chemical discovery and has been approached using many techniques, including generative modeling, reinforcement learning, genetic algorithms, and much more. Recent work has also applied…

Biomolecules · Quantitative Biology 2022-10-31 Elvin Lo, Pin-Yu Chen

Zero-shot Quantization: A Comprehensive Survey

Network quantization has proven to be a powerful approach to reduce the memory and computational demands of deep learning models for deployment on resource-constrained devices. However, traditional quantization methods often rely on access…

Computer Vision and Pattern Recognition · Computer Science 2025-05-15 Minjun Kim, Jaehyeon Choi, Jongkeun Lee, Wonjin Cho, U Kang

Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients

Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in…

Machine Learning · Computer Science 2025-09-04 Gwen Legate, Irina Rish, Eugene Belilovsky

MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models

Large language models have demonstrated exceptional capabilities across diverse tasks, but their fine-tuning demands significant memory, posing challenges for resource-constrained environments. Zeroth-order (ZO) optimization provides a…

Machine Learning · Computer Science 2025-02-18 Zhen Zhang, Yifan Yang, Kai Zhen, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang

ZO-SAM: Zero-Order Sharpness-Aware Minimization for Efficient Sparse Training

Deep learning models, despite their impressive achievements, suffer from high computational costs and memory requirements, limiting their usability in resource-constrained environments. Sparse neural networks significantly alleviate these…

Machine Learning · Computer Science 2026-03-16 Jie Ji, Gen Li, Kaiyuan Deng, Fatemeh Afghah, Xiaolong Ma

Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective

Zeroth-order (ZO) optimization enables memory-efficient training of neural networks by estimating gradients via forward passes only, eliminating the need for backpropagation. However, the stochastic nature of gradient estimation…

Machine Learning · Computer Science 2026-03-24 Chen Zhang, Yuxin Cheng, Chenchen Ding, Shuqi Wang, Jingreng Lei, Runsheng Yu, Yik-Chung WU, Ngai Wong

Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered

Zeroth-order (ZO) optimization, learning from finite differences of function evaluations without backpropagation, has recently regained attention in deep learning due to its memory efficiency and applicability to gray- or black-box…

Machine Learning · Computer Science 2026-05-19 Sijia Liu, Yicheng Lang, Soumyadeep Pal, Changsheng Wang, Yancheng Huang, Chongyu Fan, James Diffenderfer, Bhavya Kailkhura, Yihua Zhang

A Zeroth-Order Extra-Gradient Method for Black-Box Constrained Optimization

Non-analytical objectives and constraints often arise in control systems, particularly in problems with complex dynamics, which are challenging yet lack efficient solution methods. In this work, we consider general constrained optimization…

Optimization and Control · Mathematics 2025-07-16 Yuke Zhou, Ruiyang Jin, Siyang Gao, Jianxiao Wang, Jie Song

Zeroth-Order Constrained Optimization from a Control Perspective via Feedback Linearization

Safe derivative-free optimization under unknown constraints is a fundamental challenge in modern learning and control. Existing zeroth-order (ZO) methods typically still assume access to a first-order oracle of the constraint functions or…

Optimization and Control · Mathematics 2026-01-29 Runyu Zhang, Gioele Zardini, Asuman Ozdaglar, Jeff Shamma, Na Li

ZeroQ: A Novel Zero Shot Quantization Framework

Quantization is a promising approach for reducing the inference time and memory footprint of neural networks. However, most existing quantization methods require access to the original training dataset for retraining during quantization.…

Computer Vision and Pattern Recognition · Computer Science 2020-03-29 Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

Towards solving large QUBO problems using quantum algorithms: improving the LogQ scheme

The LogQ algorithm encodes Quadratic Unconstrained Binary Optimization (QUBO) problems with exponentially fewer qubits than the Quantum Approximate Optimization Algorithm (QAOA). The advantages of conventional LogQ are accompanied by a…

Quantum Physics · Physics 2025-07-14 Yagnik Chatterjee, Jérémie Messud