Machine Learning and Deep Learning
59. Distributed Training: Data Parallelism

ZeRO-Offload: CPU Memory Extension

Offloading optimizer states and gradients to CPU memory to train larger models on limited GPU resources.
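
To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It assumes mixed-precision Adam training with the commonly cited per-parameter costs: 2 bytes for fp16 parameters, 2 bytes for fp16 gradients, and 12 bytes for fp32 optimizer state (master weights, momentum, variance). The function name `model_state_bytes` and the 1-billion-parameter model are illustrative assumptions, and the estimate ignores activations, temporary buffers, and the brief period during which gradients sit on the GPU before being copied out.

```python
# Rough per-parameter memory accounting for mixed-precision Adam training.
# These byte counts are assumptions for illustration, not measured values.
BYTES_FP16_PARAMS = 2      # fp16 copy of the weights used in forward/backward
BYTES_FP16_GRADS = 2       # fp16 gradients
BYTES_FP32_OPTIMIZER = 12  # fp32 master weights + momentum + variance (4 bytes each)


def model_state_bytes(num_params: int, offload: bool) -> dict:
    """Estimate where model states live, with or without CPU offloading.

    Hypothetical helper: with offload=True, gradients and optimizer states
    are counted against CPU memory; fp16 parameters always stay on the GPU.
    """
    gpu = BYTES_FP16_PARAMS * num_params
    offloadable = (BYTES_FP16_GRADS + BYTES_FP32_OPTIMIZER) * num_params
    if offload:
        return {"gpu": gpu, "cpu": offloadable}
    return {"gpu": gpu + offloadable, "cpu": 0}


GIB = 1024 ** 3
for offload in (False, True):
    usage = model_state_bytes(1_000_000_000, offload)
    print(
        f"offload={offload}: "
        f"GPU {usage['gpu'] / GIB:.1f} GiB, CPU {usage['cpu'] / GIB:.1f} GiB"
    )
```

Under these assumptions, offloading cuts the GPU footprint of the model states from 16 bytes to 2 bytes per parameter, at the cost of extra CPU memory and GPU-to-CPU transfer traffic.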
