Machine Learning and Deep Learning
48. Deep Reinforcement Learning: Value-Based (lesson 2242 of 3,538)

Computing Target Q-Values

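Calculate TD targets using the target network and the Bellman equation: target = reward + gamma * max_a' Q_target(next_state, a'), where Q_target is the target network and the maximum is taken over the actions available in next_state.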
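The target formula can be sketched in a few lines of plain Python. This is a minimal illustration, not the lesson's own implementation: the function name `compute_td_targets`, the `dones` flag, and the example numbers are all assumptions made here, and plain lists stand in for the batched output a target network would normally produce. Dropping the bootstrap term for terminal transitions is a common convention that the formula above does not state explicitly.

```python
def compute_td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Return one TD target per transition (hypothetical helper).

    rewards:       list of floats, the reward for each transition
    next_q_values: list of lists, target-network Q-values for every
                   action in the next state (assumed precomputed)
    dones:         list of bools, True if the next state is terminal
    gamma:         discount factor
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # Assumption: terminal transitions have no future return,
        # so the bootstrap term gamma * max Q is dropped for them.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


# Made-up batch of three transitions with two actions each.
rewards = [1.0, 0.0, -1.0]
next_q_values = [[0.5, 2.0], [1.0, 0.0], [3.0, 4.0]]
dones = [False, False, True]

targets = compute_td_targets(rewards, next_q_values, dones, gamma=0.5)
print(targets)  # [2.0, 0.5, -1.0]
```

In a deep learning framework the same computation would typically be one batched tensor expression, with the target network's output excluded from gradient tracking so that only the online network is updated.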
