Course contentsShow
Machine Learning and Deep Learning
Lesson 675 of 3,53816. Activation Functions and Weight InitializationPro lesson

The Vanishing Gradient Problem

Understanding why gradients can exponentially decay in deep networks, making early layers difficult to train.

This lesson is for subscribers

You've completed the free preview. Subscribe to unlock every lesson in every course.