Related papers: Sharpness-Aware Minimization Improves Language Mod…

GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization

Recently, Sharpness-Aware Minimization (SAM) algorithm has shown state-of-the-art generalization abilities in vision tasks. It demonstrates that flat minima tend to imply better generalization abilities. However, it has some difficulty…

Machine Learning · Computer Science 2022-10-14 Zhiyuan Zhang, Ruixuan Luo, Qi Su, Xu Sun

Sharpness-Aware Minimization for Efficiently Improving Generalization

In today's heavily overparameterized models, the value of the training loss provides few guarantees on model generalization ability. Indeed, optimizing only the training loss value, as is commonly done, can easily lead to suboptimal model…

Machine Learning · Computer Science 2021-04-30 Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur

Model Generalization: A Sharpness Aware Optimization Perspective

Sharpness-Aware Minimization (SAM) and adaptive sharpness-aware minimization (ASAM) aim to improve model generalization. In this project, we proposed three experiments to validate their generalization from the sharpness aware…

Machine Learning · Computer Science 2022-08-16 Jozef Marus Coldenhoff, Chengkun Li, Yurui Zhu

1st-Order Magic: Analysis of Sharpness-Aware Minimization

Sharpness-Aware Minimization (SAM) is an optimization technique designed to improve generalization by favoring flatter loss minima. To achieve this, SAM optimizes a modified objective that penalizes sharpness, using computationally…

Machine Learning · Computer Science 2024-11-05 Nalin Tiwary, Siddarth Aananth

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization. Prior works show that the recently-proposed sharpness-aware minimization (SAM) optimization method can improve the model…

Computation and Language · Computer Science 2022-10-12 Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, Dacheng Tao

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

Sharpness-Aware Minimization (SAM) has emerged as a promising alternative optimizer to stochastic gradient descent (SGD). The originally-proposed motivation behind SAM was to bias neural networks towards flatter minima that are believed to…

Machine Learning · Computer Science 2024-06-03 Jacob Mitchell Springer, Vaishnavh Nagarajan, Aditi Raghunathan

On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees

Sharpness-Aware Minimization (SAM) is a recent optimization framework aiming to improve the deep neural network generalization, through obtaining flatter (i.e. less sharp) solutions. As SAM has been numerically successful, recent papers…

Machine Learning · Statistics 2023-05-22 Kayhan Behdin, Rahul Mazumder

mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization

Modern deep learning models are over-parameterized, where different optima can result in widely varying generalization performance. The Sharpness-Aware Minimization (SAM) technique modifies the fundamental loss function that steers gradient…

Machine Learning · Statistics 2023-10-03 Kayhan Behdin, Qingquan Song, Aman Gupta, Sathiya Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, David Durfee, Rahul Mazumder

Towards Understanding Sharpness-Aware Minimization

Sharpness-Aware Minimization (SAM) is a recent training method that relies on worst-case weight perturbations which significantly improves generalization in various settings. We argue that the existing justifications for the success of SAM…

Machine Learning · Computer Science 2022-06-14 Maksym Andriushchenko, Nicolas Flammarion

Tilted Sharpness-Aware Minimization

Sharpness-Aware Minimization (SAM) has been demonstrated to improve the generalization performance of overparameterized models by seeking flat minima on the loss landscape through optimizing model parameters that incur the largest loss…

Machine Learning · Computer Science 2025-06-10 Tian Li, Tianyi Zhou, Jeffrey A. Bilmes

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

The challenge of overfitting, in which the model memorizes the training data and fails to generalize to test data, has become increasingly significant in the training of large neural networks. To tackle this challenge, Sharpness-Aware…

Machine Learning · Computer Science 2023-10-12 Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu

Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization

Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function.…

Machine Learning · Computer Science 2022-12-09 Kayhan Behdin, Qingquan Song, Aman Gupta, David Durfee, Ayan Acharya, Sathiya Keerthi, Rahul Mazumder

Do Sharpness-based Optimizers Improve Generalization in Medical Image Analysis?

Effective clinical deployment of deep learning models in healthcare demands high generalization performance to ensure accurate diagnosis and treatment planning. In recent years, significant research has focused on improving the…

Image and Video Processing · Electrical Eng. & Systems 2025-10-22 Mohamed Hassan, Aleksandar Vakanski, Min Xian

Normalization Layers Are All That Sharpness-Aware Minimization Needs

Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters…

Machine Learning · Computer Science 2023-11-20 Maximilian Mueller, Tiffany Vlaar, David Rolnick, Matthias Hein

Avoiding spurious sharpness minimization broadens applicability of SAM

Curvature regularization techniques like Sharpness Aware Minimization (SAM) have shown great promise in improving generalization on vision tasks. However, we find that SAM performs poorly in domains like natural language processing (NLP),…

Machine Learning · Computer Science 2025-02-05 Sidak Pal Singh, Hossein Mobahi, Atish Agarwala, Yann Dauphin

How Does Sharpness-Aware Minimization Minimize Sharpness?

Sharpness-Aware Minimization (SAM) is a highly effective regularization technique for improving the generalization of deep neural networks for various settings. However, the underlying working of SAM remains elusive because of various…

Machine Learning · Computer Science 2023-01-06 Kaiyue Wen, Tengyu Ma, Zhiyuan Li

Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization

Deep neural networks have been increasingly used in safety-critical applications such as medical diagnosis and autonomous driving. However, many studies suggest that they are prone to being poorly calibrated and have a propensity for…

Machine Learning · Computer Science 2025-06-02 Chengli Tan, Yubo Zhou, Haishan Ye, Guang Dai, Junmin Liu, Zengjie Song, Jiangshe Zhang, Zixiang Zhao, Yunda Hao, Yong Xu

Sharpness-Aware Minimization: General Analysis and Improved Rates

Sharpness-Aware Minimization (SAM) has emerged as a powerful method for improving generalization in machine learning models by minimizing the sharpness of the loss landscape. However, despite its success, several important questions…

Optimization and Control · Mathematics 2025-03-05 Dimitris Oikonomou, Nicolas Loizou

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training

Sharpness-Aware Minimization (SAM) has substantially improved the generalization of neural networks under various settings. Despite the success, its effectiveness remains poorly understood. In this work, we discover an intriguing phenomenon…

Machine Learning · Computer Science 2025-02-21 Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan

Sharpness-Aware Machine Unlearning

We characterize the effectiveness of Sharpness-aware minimization (SAM) under a machine unlearning scheme, where unlearning forget signals interferes with learning retain signals. While previous work proves that SAM improves generalization…

Machine Learning · Computer Science 2026-03-10 Haoran Tang, Rajiv Khanna