The Dynamics Model: Predicting Next States and Rewards

Learning a transition function p(s'|s,a) and a reward function so that a model can simulate environment behavior.
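
As an illustration of the idea, here is a minimal sketch of a dynamics model for a small discrete environment. It is not taken from the lesson: the class name `TabularDynamicsModel`, its methods, and the count-based estimator are all assumptions chosen for simplicity. It estimates p(s'|s,a) from observed transition frequencies and the reward as the running mean per (s, a) pair. Continuous or high-dimensional settings typically use a learned function approximator instead.

```python
import random
from collections import defaultdict


class TabularDynamicsModel:
    """Hypothetical count-based estimate of p(s'|s,a) and the mean reward."""

    def __init__(self):
        # counts[(s, a)][s_next] = times s_next was observed after (s, a)
        self.counts = defaultdict(lambda: defaultdict(int))
        self.reward_sum = defaultdict(float)  # total reward seen for (s, a)
        self.visits = defaultdict(int)        # number of times (s, a) was seen

    def update(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s_next)."""
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def transition_probs(self, s, a):
        """Return the empirical distribution over next states as a dict."""
        n = self.visits.get((s, a), 0)
        if n == 0:
            return {}
        return {s2: c / n for s2, c in self.counts[(s, a)].items()}

    def expected_reward(self, s, a):
        """Return the mean observed reward for (s, a), or 0.0 if unseen."""
        n = self.visits.get((s, a), 0)
        return self.reward_sum[(s, a)] / n if n else 0.0

    def sample(self, s, a, rng=random):
        """Simulate one step: draw s' from the estimate, return (s', mean reward)."""
        probs = self.transition_probs(s, a)
        if not probs:
            raise KeyError(f"no data for state-action pair {(s, a)!r}")
        states = list(probs)
        weights = list(probs.values())
        s_next = rng.choices(states, weights=weights, k=1)[0]
        return s_next, self.expected_reward(s, a)


# Usage: fit on four made-up transitions, then query and simulate.
model = TabularDynamicsModel()
for s, a, r, s_next in [(0, 1, 1.0, 1), (0, 1, 0.0, 0), (0, 1, 1.0, 1), (0, 1, 1.0, 1)]:
    model.update(s, a, r, s_next)

probs = model.transition_probs(0, 1)      # empirical p(s' | s=0, a=1)
mean_reward = model.expected_reward(0, 1)
simulated_next, simulated_reward = model.sample(0, 1)
```

Once fitted, `sample` can stand in for the real environment when generating imagined rollouts, with the caveat that the estimates are only as good as the data coverage for each (s, a) pair.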
