Machine Learning and Deep Learning
47. Reinforcement Learning: Temporal Difference Methods

Greedy Action Selection and Its Limitations

How pure greedy policies exploit current estimates but fail to discover potentially better actions.
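A minimal sketch of this failure mode, under assumptions that are not from the lesson itself: a hypothetical three-armed bandit with fixed, noise-free rewards, initial estimates of zero, sample-average updates, and ties broken toward the lowest index. The names `greedy_action`, `run_greedy`, and `true_rewards` are illustrative only.

```python
def greedy_action(q_values):
    """Return the index of the highest estimate (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_greedy(true_rewards, steps=100):
    """Run a pure greedy agent on a deterministic bandit.

    Returns the final value estimates and the number of pulls per arm.
    """
    n = len(true_rewards)
    q_values = [0.0] * n  # assumed initial estimates: all zero
    counts = [0] * n
    for _ in range(steps):
        a = greedy_action(q_values)
        reward = true_rewards[a]  # assumed noise-free reward for this arm
        counts[a] += 1
        # Incremental sample-average update of the chosen arm's estimate.
        q_values[a] += (reward - q_values[a]) / counts[a]
    return q_values, counts


# Hypothetical arm payoffs: the last arm is the best one.
true_rewards = [0.2, 0.5, 1.0]
q_values, counts = run_greedy(true_rewards)
print("pulls per arm:", counts)
print("estimates:", q_values)
```

In this setup the first pull goes to arm 0, whose estimate then rises above the untouched zeros, so the agent keeps selecting arm 0 and never samples the arm that pays 1.0. The estimates for the other arms stay at their initial values because nothing in a pure greedy rule ever causes them to be tried.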
