AdamW: Decoupled Weight Decay

Fixing Adam's interaction with L2 regularization by decoupling weight decay from gradient-based updates.
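
As a rough illustration of the summary above, the sketch below contrasts a single scalar update of Adam with an L2 penalty folded into the gradient against an update with decoupled weight decay. The function names `adam_l2_step` and `adamw_step` and the default hyperparameters are hypothetical, chosen only for this example; this is a minimal sketch of the general idea, not the lesson's own code or any library's implementation.

```python
import math


def adam_l2_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=1e-2):
    """One Adam step with L2 regularization added to the gradient (illustrative)."""
    g = g + wd * w                      # L2 penalty enters through the gradient
    m = b1 * m + (1 - b1) * g           # first-moment estimate
    v = b2 * v + (1 - b2) * g * g       # second-moment estimate
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    # The penalty term is rescaled by the adaptive denominator along with the gradient.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=1e-2):
    """One Adam step with decoupled weight decay (illustrative)."""
    m = b1 * m + (1 - b1) * g           # moments see only the loss gradient
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # Weight decay is applied directly to the weight, outside the adaptive scaling.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps) - lr * wd * w
    return w, m, v


if __name__ == "__main__":
    # With a zero loss gradient, only the regularization term moves the weight.
    for w0 in (1.0, 10.0):
        w_l2, _, _ = adam_l2_step(w0, 0.0, 0.0, 0.0, t=1)
        w_dec, _, _ = adamw_step(w0, 0.0, 0.0, 0.0, t=1)
        print(f"w0={w0}: Adam+L2 shrink={w0 - w_l2:.6g}, decoupled shrink={w0 - w_dec:.6g}")
```

In this toy setting with a zero loss gradient, the L2 variant shrinks the weight by roughly `lr` regardless of its magnitude, because the penalty is normalized by the second-moment estimate, while the decoupled variant shrinks it by `lr * wd * w`, in proportion to the weight itself.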
