Reinforcement Learning Crash Course - Dynamic Programming

Reinforcement Learning Crash Course by Viviane Clay

0:03:15 Policy Iteration
0:06:00 Policy Iteration Pseudocode
0:10:05 Policy Iteration Exercise
0:11:00 Exercise Solution
0:12:10 Value Iteration + Pseudocode
0:15:00 Dynamic Programming Exercise
0:16:20 Exercise Solution
Comments

At 8:38 there should be: change = MIN(change, abs(old_V - V(S))) instead of MAX(...)

asozykin
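
For context on the convergence check discussed in the comment above, here is a minimal sketch of iterative policy evaluation in Python. It is not the pseudocode from the video: the function name, the `transitions` layout, and the three-state example are assumptions made for illustration. In the textbook formulation (Sutton and Barto), `change` tracks the largest per-state update in a sweep, so it is accumulated with max and the loop stops once it falls below a threshold. If `change` starts at 0, accumulating with min would leave it at 0 and end the loop after the first sweep.

```python
def policy_evaluation(policy, transitions, gamma=0.9, theta=1e-6):
    """Iterative policy evaluation for a deterministic policy (illustrative sketch).

    transitions[s][a] is a list of (probability, next_state, reward) tuples.
    policy[s] is the action taken in state s; states missing from `policy`
    are treated as terminal and keep a value of 0.
    """
    V = {s: 0.0 for s in transitions}
    while True:
        change = 0.0  # largest value update seen in this sweep
        for s in transitions:
            if s not in policy:
                continue  # terminal state, nothing to update
            old_v = V[s]
            V[s] = sum(p * (r + gamma * V[s2])
                       for p, s2, r in transitions[s][policy[s]])
            change = max(change, abs(old_v - V[s]))
        if change < theta:
            return V


# Hypothetical three-state chain: A -> B -> T, reward 1 on reaching T.
transitions = {
    "A": {"go": [(1.0, "B", 0.0)]},
    "B": {"go": [(1.0, "T", 1.0)]},
    "T": {},
}
policy = {"A": "go", "B": "go"}

values = policy_evaluation(policy, transitions)
print(values)
```

In this example the values are updated in place during each sweep, so the reward at T needs one extra sweep to propagate back to A before the largest change drops below `theta`.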