Machine Learning and Deep Learning
75. LLM Safety and Alignment Challenges

Testing Instruction-Following Boundaries

Evaluating when models inappropriately comply with harmful requests or fail to refuse problematic instructions.
