This lesson distinguishes outer alignment (the reward is specified correctly, so it captures what the designer actually wants) from inner alignment (the objective the model learns matches the training objective).
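To make the distinction concrete, here is a minimal toy sketch, not taken from the lesson itself. It assumes a hypothetical 1-D world with positions 0 to 4, and every function name in it is invented for illustration. The first part shows an outer-alignment failure (a reward that pays for the wrong thing); the second shows an inner-alignment failure (a correct reward, but a learned behavior that only matches it on the training distribution).

```python
# Toy 1-D world with positions 0..4. All names are hypothetical and exist
# only to illustrate the two failure modes; this is not a real training setup.

def intended_objective(end_pos, goal_pos):
    """What the designer actually wants: the episode ends on the goal."""
    return 1.0 if end_pos == goal_pos else 0.0

def misspecified_reward(end_pos, goal_pos):
    """Outer-alignment failure: pays for being far right, ignoring the goal."""
    return float(end_pos)

def well_specified_reward(end_pos, goal_pos):
    """Outer-aligned reward: identical to the intended objective."""
    return intended_objective(end_pos, goal_pos)

def learned_policy_end_pos(goal_pos):
    """Stand-in for a trained model that internalized 'go to the right edge'.
    It ignores goal_pos, so it is only correct when the goal is at position 4."""
    return 4

def mean_reward(reward_fn, goals):
    """Average reward of the learned policy over a list of goal positions."""
    return sum(reward_fn(learned_policy_end_pos(g), g) for g in goals) / len(goals)

# Outer alignment: with the goal at 0, the misspecified reward prefers
# ending at 4, while the intended objective prefers ending at 0.
outer_gap = misspecified_reward(4, 0) > misspecified_reward(0, 0)

# Inner alignment: the reward is correct, but the learned proxy only
# agrees with it where the goal happens to sit at the right edge.
train_goals = [4, 4, 4]   # training distribution
test_goals = [0, 2]       # shifted distribution

train_score = mean_reward(well_specified_reward, train_goals)
test_score = mean_reward(well_specified_reward, test_goals)

print(f"outer gap present: {outer_gap}")
print(f"train score: {train_score}, test score: {test_score}")
```

In this sketch the learned policy scores perfectly in training and fails once the goal moves, even though the reward itself was specified correctly. That is the sense in which inner alignment can fail independently of outer alignment.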