Machine Learning and Deep Learning
38. Instruction Tuning and Alignment

The Reference Model in DPO

Why DPO requires a frozen reference model and how it prevents over-optimization.
