Calculate TD targets using the target network and the Bellman equation: target = reward + gamma * max over actions a of Q_target(next_state, a).
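The target computation can be sketched in a few lines of NumPy. This is a minimal illustration rather than the lesson's own code: the function name `td_targets`, the argument names, and the default `gamma` are hypothetical, and `next_q_values` is assumed to be the target network's output for a batch of next states, with shape (batch, num_actions). The `dones` mask, which drops the bootstrap term for terminal transitions, is a common convention that the formula above does not state, so treat it as an assumption.

```python
import numpy as np


def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute TD targets: reward + gamma * max_a Q_target(next_state, a).

    rewards:        shape (batch,), reward for each transition
    next_q_values:  shape (batch, num_actions), target-network Q-values
                    for each next state (assumed to be precomputed)
    dones:          shape (batch,), True where the episode ended
                    (assumption: terminal states get no bootstrap term)
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    next_q_values = np.asarray(next_q_values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    # Greedy value of the next state under the target network.
    max_next_q = next_q_values.max(axis=1)

    # Bellman target; (1 - dones) zeroes the bootstrap for terminal steps.
    return rewards + gamma * (1.0 - dones) * max_next_q


# Two transitions: the first continues, the second is terminal.
example = td_targets(
    rewards=[1.0, 0.0],
    next_q_values=[[0.5, 2.0], [1.0, 3.0]],
    dones=[False, True],
    gamma=0.9,
)
```

With these inputs the first target is 1.0 + 0.9 * 2.0 = 2.8, and the second is 0.0 because the terminal mask removes the bootstrap term. In a training loop the targets are normally treated as constants, so no gradient flows back into the target network.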