
Direct Instruction Attacks

How adversaries use direct instructions, such as role-play prompts or hypothetical framings, to circumvent content policies.
