This lesson is for subscribers
You've completed the free preview. Subscribe to unlock every lesson in every course.
Why directly optimizing policies can be more effective than learning value functions in certain environments.
You've completed the free preview. Subscribe to unlock every lesson in every course.