Machine Learning and Deep Learning
47. Reinforcement Learning: Temporal Difference Methods

Sample-Average Method

Computing action values by averaging observed rewards and understanding incremental update rules.
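
As a rough illustration of the idea named above, the sketch below estimates the value of a single action in two ways: as the plain mean of its observed rewards, and with the standard incremental rule, new estimate = old estimate + (reward - old estimate) / n, where n counts the rewards seen so far including the current one. The function names, the default estimate of 0.0, and the reward values are assumptions made for this example, not taken from the lesson.

```python
from typing import Iterable, List


def sample_average(rewards: Iterable[float]) -> float:
    """Batch estimate: the plain mean of all rewards observed for one action."""
    observed = list(rewards)
    if not observed:
        return 0.0  # assumed default estimate before any reward is seen
    return sum(observed) / len(observed)


def incremental_update(q: float, n: int, reward: float) -> float:
    """One incremental step; n is the number of rewards seen, including this one."""
    return q + (reward - q) / n


def incremental_estimates(rewards: Iterable[float], initial: float = 0.0) -> List[float]:
    """Return the running estimate after each reward, without storing past rewards."""
    q = initial
    history = []
    for n, reward in enumerate(rewards, start=1):
        q = incremental_update(q, n, reward)
        history.append(q)
    return history


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0, 5.0]  # made-up rewards for one action
    print(incremental_estimates(rewards))
    print(sample_average(rewards))
```

Both approaches give the same final estimate. The incremental form only needs the current estimate and a count, so memory and per-step computation stay constant as more rewards arrive.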
