Machine Learning and Deep Learning
75. LLM Safety and Alignment Challenges

Mode Collapse and Response Diversity

How RLHF can lead to repetitive or formulaic outputs as models converge to high-reward but stereotyped responses.
