Related papers: Attributing Data for Sharpness-Aware Minimization

Sharpness-Aware Minimization for Efficiently Improving Generalization

In today's heavily overparameterized models, the value of the training loss provides few guarantees on model generalization ability. Indeed, optimizing only the training loss value, as is commonly done, can easily lead to suboptimal model…

Machine Learning · Computer Science 2021-04-30 Pierre Foret , Ariel Kleiner , Hossein Mobahi , Behnam Neyshabur

Towards Understanding Sharpness-Aware Minimization

Sharpness-Aware Minimization (SAM) is a recent training method that relies on worst-case weight perturbations which significantly improves generalization in various settings. We argue that the existing justifications for the success of SAM…

Machine Learning · Computer Science 2022-06-14 Maksym Andriushchenko , Nicolas Flammarion

Bilateral Sharpness-Aware Minimization for Flatter Minima

Sharpness-Aware Minimization (SAM) enhances generalization by reducing a Max-Sharpness (MaxS). Despite the practical success, we empirically found that the MAxS behind SAM's generalization enhancements face the "Flatness Indicator Problem"…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Jiaxin Deng , Junbiao Pang , Baochang Zhang , Qingming Huang

Critical Influence of Overparameterization on Sharpness-aware Minimization

Sharpness-Aware Minimization (SAM) has attracted considerable attention for its effectiveness in improving generalization in deep neural network training by explicitly minimizing sharpness in the loss landscape. Its success, however, relies…

Machine Learning · Computer Science 2025-06-16 Sungbin Shin , Dongyeop Lee , Maksym Andriushchenko , Namhoon Lee

Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models

Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate the sharp local minima from the training trajectory and mitigate generalization degradation. However, SAM requires…

Machine Learning · Computer Science 2024-06-21 Yili Wang , Kaixiong Zhou , Ninghao Liu , Ying Wang , Xin Wang

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Deep neural networks often suffer from poor generalization caused by complex and non-convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized…

Machine Learning · Computer Science 2022-10-25 Peng Mi , Li Shen , Tianhe Ren , Yiyi Zhou , Xiaoshuai Sun , Rongrong Ji , Dacheng Tao

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of…

Artificial Intelligence · Computer Science 2023-07-03 Peng Mi , Li Shen , Tianhe Ren , Yiyi Zhou , Tianshuo Xu , Xiaoshuai Sun , Tongliang Liu , Rongrong Ji , Dacheng Tao

An Adaptive Policy to Employ Sharpness-Aware Minimization

Sharpness-aware minimization (SAM), which searches for flat minima by min-max optimization, has been shown to be useful in improving model generalization. However, since each SAM update requires computing two gradients, its computational…

Machine Learning · Computer Science 2023-05-01 Weisen Jiang , Hansi Yang , Yu Zhang , James Kwok

Rethinking Sharpness-Aware Minimization as Variational Inference

Sharpness-aware minimization (SAM) aims to improve the generalisation of gradient-based learning by seeking out flat minima. In this work, we establish connections between SAM and Mean-Field Variational Inference (MFVI) of neural network…

Machine Learning · Statistics 2022-10-20 Szilvia Ujváry , Zsigmond Telek , Anna Kerekes , Anna Mészáros , Ferenc Huszár

Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization

Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function.…

Machine Learning · Computer Science 2022-12-09 Kayhan Behdin , Qingquan Song , Aman Gupta , David Durfee , Ayan Acharya , Sathiya Keerthi , Rahul Mazumder

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

The challenge of overfitting, in which the model memorizes the training data and fails to generalize to test data, has become increasingly significant in the training of large neural networks. To tackle this challenge, Sharpness-Aware…

Machine Learning · Computer Science 2023-10-12 Zixiang Chen , Junkai Zhang , Yiwen Kou , Xiangning Chen , Cho-Jui Hsieh , Quanquan Gu

X-SAM: Boosting Sharpness-Aware Minimization with Dominant-Eigenvector Gradient Correction

Sharpness-Aware Minimization (SAM) aims to improve generalization by minimizing a worst-case perturbed loss over a small neighborhood of model parameters. However, during training, its optimization behavior does not always align with…

Machine Learning · Computer Science 2026-01-16 Hongru Duan , Yongle Chen , Lei Guan

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

Sharpness-Aware Minimization (SAM) has emerged as a promising alternative optimizer to stochastic gradient descent (SGD). The originally-proposed motivation behind SAM was to bias neural networks towards flatter minima that are believed to…

Machine Learning · Computer Science 2024-06-03 Jacob Mitchell Springer , Vaishnavh Nagarajan , Aditi Raghunathan

Unpacking the Implicit Norm Dynamics of Sharpness-Aware Minimization in Tensorized Models

Sharpness-Aware Minimization (SAM) has been proven to be an effective optimization technique for improving generalization in overparameterized models. While prior works have explored the implicit regularization of SAM in simple two-core…

Machine Learning · Computer Science 2025-08-15 Tianxiao Cao , Kyohei Atarashi , Hisashi Kashima

A Universal Class of Sharpness-Aware Minimization Algorithms

Recently, there has been a surge in interest in developing optimization algorithms for overparameterized models as achieving generalization is believed to require algorithms with suitable biases. This interest centers on minimizing…

Machine Learning · Computer Science 2026-02-05 Behrooz Tahmasebi , Ashkan Soleymani , Dara Bahri , Stefanie Jegelka , Patrick Jaillet

Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm

Targeting solutions over `flat' regions of the loss landscape, sharpness-aware minimization (SAM) has emerged as a powerful tool to improve generalizability of deep neural network based learning. While several SAM variants have been…

Machine Learning · Computer Science 2025-01-14 Yilang Zhang , Bingcong Li , Georgios B. Giannakis

Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning

Sharpness-Aware Minimization (SAM) is an optimization method that improves generalization performance of machine learning models. Despite its superior generalization, SAM has not been actively used in real-world applications due to its…

Machine Learning · Computer Science 2025-03-17 Junhyuk Jo , Jihyun Lim , Sunwoo Lee

How Does Sharpness-Aware Minimization Minimize Sharpness?

Sharpness-Aware Minimization (SAM) is a highly effective regularization technique for improving the generalization of deep neural networks for various settings. However, the underlying working of SAM remains elusive because of various…

Machine Learning · Computer Science 2023-01-06 Kaiyue Wen , Tengyu Ma , Zhiyuan Li

Sharpness-Aware Training for Free

Modern deep neural networks (DNNs) have achieved state-of-the-art performances but are typically over-parameterized. The over-parameterization may result in undesirably large generalization error in the absence of other customized training…

Machine Learning · Computer Science 2023-03-03 Jiawei Du , Daquan Zhou , Jiashi Feng , Vincent Y. F. Tan , Joey Tianyi Zhou

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima

We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems. We show that when SAM is applied with a convex…

Machine Learning · Computer Science 2023-04-12 Peter L. Bartlett , Philip M. Long , Olivier Bousquet