Related papers: ZO-SAM: Zero-Order Sharpness-Aware Minimization fo…

200 papers

Deep neural networks often suffer from poor generalization caused by complex and non-convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized…

Machine Learning · Computer Science 2022-10-25 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao

Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of…

Artificial Intelligence · Computer Science 2023-07-03 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, Dacheng Tao

Sharpness-aware minimization (SAM) seeks the minima with a flat loss landscape to improve the generalization performance in machine learning tasks, including fine-tuning. However, its extra parameter perturbation step doubles the…

Machine Learning · Computer Science 2026-02-11 Yifei Cheng, Xianglin Yang, Guoxia Wang, Chao Huang, Fei Ma, Dianhai Yu, Xiaochun Cao, Li Shen

Interest in stochastic zeroth-order (SZO) methods has recently been revived in black-box optimization scenarios such as adversarial black-box attacks to deep neural networks. SZO methods only require the ability to evaluate the objective…

Machine Learning · Statistics 2020-11-11 Mayumi Ohta, Nathaniel Berger, Artem Sokolov, Stefan Riezler

In today's heavily overparameterized models, the value of the training loss provides few guarantees on model generalization ability. Indeed, optimizing only the training loss value, as is commonly done, can easily lead to suboptimal model…

Machine Learning · Computer Science 2021-04-30 Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur

Sharpness-Aware Minimization (SAM) is an optimization method that improves generalization performance of machine learning models. Despite its superior generalization, SAM has not been actively used in real-world applications due to its…

Machine Learning · Computer Science 2025-03-17 Junhyuk Jo, Jihyun Lim, Sunwoo Lee

Zeroth-order (ZO) optimization has become a popular technique for solving machine learning (ML) problems when first-order (FO) information is difficult or impossible to obtain. However, the scalability of ZO optimization remains an open…

Overparametrized Deep Neural Networks (DNNs) often achieve astounding performances, but may potentially result in severe generalization error. Recently, the relation between the sharpness of the loss landscape and the generalization error…

Artificial Intelligence · Computer Science 2022-05-31 Jiawei Du, Hanshu Yan, Jiashi Feng, Joey Tianyi Zhou, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan

Prompt learning has become a key method for adapting large language models to specific tasks with limited data. However, traditional gradient-based optimization methods for tuning prompts are computationally intensive, posing challenges for…

Statistics Theory · Mathematics 2025-12-30 Yao Fu, Yihang Jin, Chunxia Zhang, Junmin Liu, Guang Dai, Haishan Ye

The recently proposed optimization algorithm for deep neural networks Sharpness Aware Minimization (SAM) suggests perturbing parameters before gradient calculation by a gradient ascent step to guide the optimization into parameter space…

Machine Learning · Computer Science 2025-10-03 Marlon Becker, Frederick Altrock, Benjamin Risse

Zeroth-order (ZO) optimization provides a gradient-free alternative to first-order (FO) methods by estimating gradients via finite differences of function evaluations, and has recently emerged as a memory-efficient paradigm for fine-tuning…

Machine Learning · Computer Science 2026-02-24 Yicheng Lang, Changsheng Wang, Yihua Zhang, Mingyi Hong, Zheng Zhang, Wotao Yin, Sijia Liu

Classic zeroth-order optimization approaches typically optimize for a smoothed version of the original function, i.e., the expected objective under randomly perturbed model parameters. This can be interpreted as encouraging the loss values…

Machine Learning · Computer Science 2025-10-21 Xuchen Gong, Tian Li

While fine-tuning large language models (LLMs) for specific tasks often yields impressive results, it comes at the cost of memory inefficiency due to back-propagation in gradient-based training. Memory-efficient Zeroth-order (MeZO)…

Machine Learning · Computer Science 2026-02-17 Yong Liu, Zirui Zhu, Chaoyu Gong, Minhao Cheng, Cho-Jui Hsieh, Yang You

Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks. However, SAM incurs a high computational cost in practice, requiring up to twice as much computation as…

Machine Learning · Computer Science 2022-10-25 Renkun Ni, Ping-yeh Chiang, Jonas Geiping, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein

Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate the sharp local minima from the training trajectory and mitigate generalization degradation. However, SAM requires…

Machine Learning · Computer Science 2024-06-21 Yili Wang, Kaixiong Zhou, Ninghao Liu, Ying Wang, Xin Wang

We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or…

Optimization and Control · Mathematics 2023-03-20 HanQin Cai, Daniel Mckenzie, Wotao Yin, Zhenliang Zhang

Targeting solutions over 'flat' regions of the loss landscape, sharpness-aware minimization (SAM) has emerged as a powerful tool to improve generalizability of deep neural network based learning. While several SAM variants have been…

Machine Learning · Computer Science 2025-01-14 Yilang Zhang, Bingcong Li, Georgios B. Giannakis

Fine-tuning Large Language Models (LLMs) has proven effective for a variety of downstream tasks. However, as LLMs grow in size, the memory demands for backpropagation become increasingly prohibitive. Zeroth-order (ZO) optimization methods…

Machine Learning · Computer Science 2025-07-25 Ziming Yu, Pan Zhou, Sike Wang, Jia Li, Mi Tian, Hua Huang

Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function.…

Machine Learning · Computer Science 2022-12-09 Kayhan Behdin, Qingquan Song, Aman Gupta, David Durfee, Ayan Acharya, Sathiya Keerthi, Rahul Mazumder

Fine-tuning is powerful for adapting large language models to downstream tasks, but it often results in huge memory usages. A promising approach to mitigate this is using Zeroth-Order (ZO) optimization, which estimates gradients to replace…

Machine Learning · Computer Science 2024-10-15 Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding