Sharpness-Aware Optimizers

Sharpness-aware methods seek parameters that lie in neighborhoods with uniformly low loss rather than at isolated sharp minima, which tends to improve generalization. First proposed as SAM (Foret et al., ICLR 2021), these methods wrap a base optimizer such as SGD or AdamW and add a gradient-ascent perturbation step before each descent update. Later work makes the perturbation scale-invariant, minimizes the surrogate gap, reweights the sharpness term, amortizes the cost of the extra forward-backward pass, or extends the idea to second-order optimization.
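The two-step update described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the reference implementation of any method listed here: the toy quadratic `loss`, its hand-written `grad`, the helper `sam_step`, and the values of `lr` and `rho` are all hypothetical, and the base optimizer is plain gradient descent rather than SGD with momentum or AdamW.

```python
import math

def loss(w):
    # Toy quadratic with one sharp and one flat direction (illustrative assumption).
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], 0.1 * w[1]]

def sam_step(w, lr=0.05, rho=0.05):
    """One SAM-style update: ascend to a perturbed point, then descend from the original."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # zero gradient: no perturbation direction, nothing to update
    # Ascent step: move a distance rho along the normalized gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point to the original weights.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

w_start = [1.0, 1.0]
w_one = sam_step(w_start)   # result of a single update
w = list(w_start)
for _ in range(200):
    w = sam_step(w)
final_loss = loss(w)
```

Each call to `sam_step` evaluates the gradient twice, which is the extra forward-backward cost that several of the variants below try to reduce. Because `rho` is fixed, the iterates in this sketch settle into a small neighborhood of the minimum rather than converging to it exactly.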

| Optimizer | Venue | Paper | Code | zij |
| --- | --- | --- | --- | --- |
| SAM | ICLR 2021 | Sharpness-Aware Minimization for Efficiently Improving Generalization | community | SAM |
| ASAM | ICML 2021 | ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks | community | ASAM |
| ESAM | ICLR 2022 | Efficient Sharpness-aware Minimization for Improved Training of Neural Networks | | |
| GSAM | ICLR 2022 | Surrogate Gap Minimization Improves Sharpness-Aware Training | official | GSAM |
| LookSAM | CVPR 2022 | Towards Efficient and Scalable Sharpness-Aware Minimization | community | LookSAM |
| AE-SAM | ICLR 2023 | An Adaptive Policy to Employ Sharpness-Aware Minimization | | |
| bSAM | ICLR 2023 | SAM as an Optimal Relaxation of Bayes | official | |
| GAM | CVPR 2023 | Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization | | |
| WSAM | KDD 2023 | Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term | official | WSAM |
| AdaSAM | Neural Networks 2024 | AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks | | |
| F-SAM | CVPR 2024 | Friendly Sharpness-Aware Minimization | official | |
| FGSAM | NeurIPS 2024 | Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification | | |
| Lookbehind-SAM | ICML 2024 | Lookbehind-SAM: k steps back, 1 step forward | | |
| MSAM | arXiv 2024 | Momentum-SAM: Sharpness Aware Minimization without Computational Overhead | official | |
| SAMPa | NeurIPS 2024 | SAMPa: Sharpness-aware Minimization Parallelized | | |
| AsyncSAM | arXiv 2025 | Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning | | |
| GCSAM | arXiv 2025 | GCSAM: Gradient Centralized Sharpness Aware Minimization | official | |
| LightSAM | arXiv 2025 | LightSAM: Parameter-Agnostic Sharpness-Aware Minimization | | |
| SASSHA | ICML 2025 | SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation | official | |
| SSAM | JMLR 2025 | Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy | | |
| SAM-Polyak (Adaptive SAM with Polyak step size) | ICML 2026 | Adaptive Sharpness-Aware Minimization with a Polyak-type Step size: A Theory-Grounded Scheduler | official | |
| X-SAM | arXiv 2026 | X-SAM: Boosting Sharpness-Aware Minimization with Dominant-Eigenvector Gradient Correction | | |
| M-SAM (Modality-Aware SAM) | NeurIPS 2025 | Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning | | |
| ZSharp (SAM with Z-Score Gradient Filtering) | NeurIPS 2025 OPT Workshop (also accepted to ICASSP 2026) | Sharpness-Aware Minimization with Z-Score Gradient Filtering | official | |
| Focal-SAM | ICML 2025 | Focal-SAM: Focal Sharpness-Aware Minimization for Long-Tailed Classification | official | |
| Functional SAM | ICML 2025 | Avoiding spurious sharpness minimization broadens applicability of SAM | | |
| FedGMT | ICML 2025 | One Arrow, Two Hawks: Sharpness-aware Minimization for Federated Learning via Global Model Trajectory | official | |
| LE-SAM | ICML 2026 | Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization | | |