Related papers: Friendly Sharpness-Aware Minimization

mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization

Modern deep learning models are over-parameterized, where different optima can result in widely varying generalization performance. The Sharpness-Aware Minimization (SAM) technique modifies the fundamental loss function that steers gradient…

Machine Learning · Statistics 2023-10-03 Kayhan Behdin, Qingquan Song, Aman Gupta, Sathiya Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, David Durfee, Rahul Mazumder

Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization

Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function.…

Machine Learning · Computer Science 2022-12-09 Kayhan Behdin, Qingquan Song, Aman Gupta, David Durfee, Ayan Acharya, Sathiya Keerthi, Rahul Mazumder

Towards Understanding Sharpness-Aware Minimization

Sharpness-Aware Minimization (SAM) is a recent training method that relies on worst-case weight perturbations and significantly improves generalization in various settings. We argue that the existing justifications for the success of SAM…

Machine Learning · Computer Science 2022-06-14 Maksym Andriushchenko, Nicolas Flammarion

How Does Sharpness-Aware Minimization Minimize Sharpness?

Sharpness-Aware Minimization (SAM) is a highly effective regularization technique for improving the generalization of deep neural networks in various settings. However, the underlying workings of SAM remain elusive because of various…

Machine Learning · Computer Science 2023-01-06 Kaiyue Wen, Tengyu Ma, Zhiyuan Li

Fundamental Convergence Analysis of Sharpness-Aware Minimization

The paper investigates the fundamental convergence properties of Sharpness-Aware Minimization (SAM), a recently proposed gradient-based optimization method [Foret et al., 2021] that significantly improves the generalization of deep neural…

Optimization and Control · Mathematics 2024-10-22 Pham Duy Khanh, Hoang-Chau Luong, Boris S. Mordukhovich, Dat Ba Tran

GCSAM: Gradient Centralized Sharpness Aware Minimization

The generalization performance of deep neural networks (DNNs) is a critical factor in achieving robust model behavior on unseen data. Recent studies have highlighted the importance of sharpness-based measures in promoting generalization by…

Machine Learning · Computer Science 2025-01-28 Mohamed Hassan, Aleksandar Vakanski, Boyu Zhang, Min Xian

Sharpness-Aware Minimization for Efficiently Improving Generalization

In today's heavily overparameterized models, the value of the training loss provides few guarantees on model generalization ability. Indeed, optimizing only the training loss value, as is commonly done, can easily lead to suboptimal model…

Machine Learning · Computer Science 2021-04-30 Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Overparametrized Deep Neural Networks (DNNs) often achieve astounding performances, but may potentially result in severe generalization error. Recently, the relation between the sharpness of the loss landscape and the generalization error…

Artificial Intelligence · Computer Science 2022-05-31 Jiawei Du, Hanshu Yan, Jiashi Feng, Joey Tianyi Zhou, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

Sharpness-Aware Minimization (SAM) has emerged as a promising alternative optimizer to stochastic gradient descent (SGD). The originally-proposed motivation behind SAM was to bias neural networks towards flatter minima that are believed to…

Machine Learning · Computer Science 2024-06-03 Jacob Mitchell Springer, Vaishnavh Nagarajan, Aditi Raghunathan

Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models

Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate the sharp local minima from the training trajectory and mitigate generalization degradation. However, SAM requires…

Machine Learning · Computer Science 2024-06-21 Yili Wang, Kaixiong Zhou, Ninghao Liu, Ying Wang, Xin Wang

Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise

Sharpness-aware minimization (SAM) has emerged as a highly effective technique to improve model generalization, but its underlying principles are not fully understood. We investigate m-sharpness, where SAM performance improves monotonically…

Machine Learning · Computer Science 2026-04-03 Haocheng Luo, Mehrtash Harandi, Dinh Phung, Trung Le

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training

Sharpness-Aware Minimization (SAM) has substantially improved the generalization of neural networks under various settings. Despite the success, its effectiveness remains poorly understood. In this work, we discover an intriguing phenomenon…

Machine Learning · Computer Science 2025-02-21 Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan

On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees

Sharpness-Aware Minimization (SAM) is a recent optimization framework that aims to improve deep neural network generalization by obtaining flatter (i.e. less sharp) solutions. As SAM has been numerically successful, recent papers…

Machine Learning · Statistics 2023-05-22 Kayhan Behdin, Rahul Mazumder

Sharpness-Aware Minimization Alone can Improve Adversarial Robustness

Sharpness-Aware Minimization (SAM) is an effective method for improving generalization ability by regularizing loss sharpness. In this paper, we explore SAM in the context of adversarial robustness. We find that using only SAM can achieve…

Machine Learning · Computer Science 2023-07-04 Zeming Wei, Jingyu Zhu, Yihao Zhang

Bilateral Sharpness-Aware Minimization for Flatter Minima

Sharpness-Aware Minimization (SAM) enhances generalization by reducing a Max-Sharpness (MaxS). Despite the practical success, we empirically found that the MaxS behind SAM's generalization enhancements faces the "Flatness Indicator Problem"…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Jiaxin Deng, Junbiao Pang, Baochang Zhang, Qingming Huang

Normalization Layers Are All That Sharpness-Aware Minimization Needs

Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters…

Machine Learning · Computer Science 2023-11-20 Maximilian Mueller, Tiffany Vlaar, David Rolnick, Matthias Hein

Sharpness-Aware Minimization Leads to Low-Rank Features

Sharpness-aware minimization (SAM) is a recently proposed method that minimizes the sharpness of the training loss of a neural network. While its generalization improvement is well-known and is the primary motivation, we uncover an…

Machine Learning · Computer Science 2023-10-31 Maksym Andriushchenko, Dara Bahri, Hossein Mobahi, Nicolas Flammarion

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

The challenge of overfitting, in which the model memorizes the training data and fails to generalize to test data, has become increasingly significant in the training of large neural networks. To tackle this challenge, Sharpness-Aware…

Machine Learning · Computer Science 2023-10-12 Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu

GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization

Recently, the Sharpness-Aware Minimization (SAM) algorithm has shown state-of-the-art generalization abilities in vision tasks. It demonstrates that flat minima tend to imply better generalization abilities. However, it has some difficulty…

Machine Learning · Computer Science 2022-10-14 Zhiyuan Zhang, Ruixuan Luo, Qi Su, Xu Sun

Sharpness-Aware Minimization with Dynamic Reweighting

Deep neural networks are often overparameterized and may not easily achieve model generalization. Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss on top of adversarially chosen…

Machine Learning · Computer Science 2022-12-07 Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen