Machine Learning and Deep Learning
46. Reinforcement Learning: Fundamentals

Return and Cumulative Reward

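Computing the discounted sum of future rewards from timestep t: G_t = Σ_{k=0}^{∞} γ^k R_{t+k+1}, where γ ∈ [0, 1] is the discount factor and R_{t+k+1} is the reward received k+1 steps after t.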
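
As a rough illustration of the formula above, the return for a finite episode can be sketched in Python. This is a minimal sketch, not the lesson's own implementation: the function names `discounted_return` and `all_returns` are hypothetical, and it assumes the episode ends after the last listed reward, so the infinite sum is truncated there. It uses the recursion G_t = R_{t+1} + γ G_{t+1}, which follows from factoring γ out of every term after the first.

```python
from typing import List, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return G_t given rewards [R_{t+1}, R_{t+2}, ...] of a finite episode.

    Assumption: all rewards after the last listed one are zero.
    """
    g = 0.0
    # Walk backwards so each step applies G = R + gamma * G_next.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def all_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return [G_t, G_{t+1}, ...], one return per timestep, in a single pass."""
    returns = [0.0] * len(rewards)
    g = 0.0
    for i in range(len(rewards) - 1, -1, -1):
        g = rewards[i] + gamma * g
        returns[i] = g
    return returns


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]  # R_{t+1}, R_{t+2}, R_{t+3}
    print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5*0 + 0.25*2 = 1.5
    print(all_returns(rewards, gamma=0.5))        # [1.5, 1.0, 2.0]
```

Working backwards avoids recomputing powers of γ and yields the return at every timestep in one pass over the episode.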
