Sharpness-Aware Minimization (SAM) in 7 minutes

Thank you for checking out my video notes on Sharpness-Aware Minimization (SAM) in 7 minutes! I would love to share my ML learning journey with you.

Paper information:
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2020). Sharpness-aware minimization for efficiently improving generalization. arXiv preprint arXiv:2010.01412.

Please let me know in the comment section about any questions, points of discussion, or anything you would like to see next. See you in the next video!

Yuxiang "Shawn" Wang

Related videos

Comments

I applied this technique a while back to a BERT-like encoder for an SSL task and got much improved results. In your experience, what kinds of tasks usually have noisy loss functions that benefit from applying this technique?

franciscobarragancastro

I could hardly understand anything you just said, but I love your channel, you are awesome! What should I learn if I want to fully understand this? I have some math background, not a lot, thanks!

santiagocalvo

Thanks for the awesome explanation! A quick question: how do you choose the perturbation values? Do you just sample the epsilon vector from a normal distribution?

yasaswijesekara

Sharpness-Aware Minimization (SAM): Current Method and Future Directions- Hossein Mobahi

KDD 2023 - Weighted Sharpness-Aware Minimization (WSAM)

CC-SAM CVPR'23 Presentation

[ECCV 2022] Improving generalization in federated learning by seeking flat minima

OWOS: Matthew Colbrook - Barriers of Deep Learning, Approximate Sharpness, and Smale's 18th Pro...

Research Talk: Enhancing the robustness of massive language models via invariant risk minimization

AI Frontiers: Machine Learning Breakthroughs May 3, 2025

On Large Batch Training For Deep Learning Generalization Gap And Sharp Minima

Flat Minima Optimizer for generalization performance

Prof. Wei-Chun Chiang - 3rd Online Computer Vision & Artificial Intelligence Workshop

[ICML 2024] How to Escape Sharp Minima with Random Perturbations

Wengong Jin - Domain Extrapolation via Regret Minimization | MLxMIT Talks - July 14, 2020

Improved Regularization of Convolutional Neural Networks with Cutout

NfNet: High-Performance Large-Scale Image Recognition Without Normalization

04/06/2021 -- Elan Rosenfeld (CMU)

Leena Vankadara - Scaling Insights from Infinite-Width Theory for Next Gen Architecture & Learni...

ASoC 2021: Self-Distillation Amplifies Regularization in Hilbert Space

Why is SAM Robust to Label Noise?

How to make machines that adapt quickly with Emtiyaz Khan, Team Leader at the RIKEN Center for AIP

6.858 Spring 2022 Lecture 8: Sandboxing libraries

Reading Summaries of New AI Papers - Sept 28, 2023

Hossein Mobahi: 'Differential Operators for Generating Structured Adversarial Examples'

Carlo Lucibello - Entropic algorithms and wide flat minima in neural networks