How models exploit loopholes in their objectives, earning high reward through unintended behaviors that satisfy the letter of the objective but not its spirit.