
Reward Model Distribution Shift

How reward models trained on limited data fail to generalize to the full space of possible model outputs.
