Bellman Equations, Dynamic Programming, Generalized Policy Iteration | Reinforcement Learning Part 2

Part two of a six-part series on Reinforcement Learning. We discuss the Bellman Equations, Dynamic Programming, and Generalized Policy Iteration.

SOURCES

[1] R. Sutton and A. Barto. Reinforcement Learning: An Introduction (2nd ed.). MIT Press, 2018.

SOURCE NOTES

The video covers the topics of Chapters 3 and 4 of [1]. The whole series teaches from [1]. [2] was a useful secondary resource.

TIMESTAMPS
0:00 What We'll Learn
1:09 Review of Previous Topics
2:46 Definition of Dynamic Programming
3:05 Discovering the Bellman Equation
7:13 Bellman Optimality
8:41 A Grid View of the Bellman Equations
11:24 Policy Evaluation
13:58 Policy Improvement
15:55 Generalized Policy Iteration
17:55 A Beautiful View of GPI
18:14 The Gambler's Problem
20:42 Watch the Next Video!
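
As a rough companion to the Policy Evaluation, Policy Improvement, and Gambler's Problem segments listed above, here is a minimal sketch of value iteration, which folds one evaluation sweep and one greedy improvement into a single update. It assumes the setup of Example 4.3 in [1]: a goal of 100, a reward of +1 only on reaching the goal, no discounting, and a heads probability of 0.4. The names `action_value`, `value_iteration`, and `greedy_stakes` are illustrative and are not taken from the video.

```python
import numpy as np

GOAL = 100     # capital at which the gambler wins (assumed, as in [1])
P_HEADS = 0.4  # assumed probability that a single flip is won


def action_value(s, a, v):
    """Expected return of staking a from capital s, given value estimates v."""
    win_reward = 1.0 if s + a == GOAL else 0.0
    return P_HEADS * (win_reward + v[s + a]) + (1 - P_HEADS) * v[s - a]


def value_iteration(theta=1e-10):
    """Sweep the states in place until the largest change falls below theta."""
    v = np.zeros(GOAL + 1)  # v[0] and v[GOAL] are terminal and stay at 0
    while True:
        delta = 0.0
        for s in range(1, GOAL):
            stakes = range(1, min(s, GOAL - s) + 1)
            best = max(action_value(s, a, v) for a in stakes)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < theta:
            return v


def greedy_stakes(v):
    """Pick, for each capital level, a stake that maximizes the one-step lookahead."""
    policy = np.zeros(GOAL + 1, dtype=int)
    for s in range(1, GOAL):
        stakes = range(1, min(s, GOAL - s) + 1)
        policy[s] = max(stakes, key=lambda a: action_value(s, a, v))
    return policy


v = value_iteration()
policy = greedy_stakes(v)
print(round(v[50], 4), policy[50])
```

Many stakes tie in this problem, so the policy returned here depends on how ties are broken (the first maximizing stake wins); the value function itself does not.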
Comments

Great video! Can you explain more about that "sneaky" equation at around 6:00? Why is G_{t+1} = v(S_{t+1}) inside the expectation?

mbeloch
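
On the question above: a sketch of the usual argument, in the notation of [1] and assuming the step near 6:00 is the standard derivation of the Bellman equation for v_π. G_{t+1} and v_π(S_{t+1}) are not equal as random variables. They only have the same expectation given S_t = s, which follows from conditioning on the next state and applying the Markov property.

```latex
\begin{align*}
v_\pi(s) &= \mathbb{E}_\pi[G_t \mid S_t = s] \\
 &= \mathbb{E}_\pi[R_{t+1} + \gamma G_{t+1} \mid S_t = s] \\
 &= \mathbb{E}_\pi[R_{t+1} \mid S_t = s]
    + \gamma \sum_{s'} \Pr(S_{t+1} = s' \mid S_t = s)\,
      \mathbb{E}_\pi[G_{t+1} \mid S_{t+1} = s', S_t = s] \\
 &= \mathbb{E}_\pi[R_{t+1} \mid S_t = s]
    + \gamma \sum_{s'} \Pr(S_{t+1} = s' \mid S_t = s)\, v_\pi(s') \\
 &= \mathbb{E}_\pi[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_t = s]
\end{align*}
```

The third line is the law of total expectation over S_{t+1}. The fourth uses the Markov property to drop S_t = s from the inner expectation, which is then v_π(s') by definition.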

I can't express how good these videos are, thank you so much for all the time you put into making them! this is a truly special channel

TheRealExecuter

So far the best and most optimized playlist for reinforcement learning.

abhinavanand

Let's read from the textbook. *He opens the book, then stares at the camera and confidently recites from memory*.

mCoding

After going through some books and paid courses, I finally understand the fundamentals of RL through your video. Well done and subscribed.

kplim

Kudos, good sir. Your pedagogical skill is both impressive and efficient.
Please continue to grace the world with it for the good of all of mankind.

timothytyree

You saved me a lot of time with a simple, concise, and easy-to-follow video compared to the others I have seen so far.

rajatjaiswal

This is the best reinforcement learning resource available on the internet, period.

manudasmd

Imagine if such great educational videos existed for all foundational topics in artificial intelligence, engineering, math, and physics. We are slowly getting there :). The 3b1b Python module manim has made it quite accessible to create high-quality, time-efficient (for learning) educational content. It's amazing what people create. Thank you for the great videos!

valterszakrevskis

THESE ARE THE BEST VIDEOS ON THIS TOPIC EVER, AND YOUR WAY OF EXPLAINING AND MAKING THINGS SOUND SO SIMPLE IS INCREDIBLE, THANK YOU A MILLION TIMES

fzet

I really enjoyed the video!
It's really helpful to me that you teach with fluctuating intonation because otherwise I can't really focus. Good job!

lyzhenyang

Damn, it really only took you 20 minutes to explain something that my professor needed two full lectures for. Thank you so much! This was so helpful

nicolaiholtkamp

Best video lectures on RL on the internet.

hassaniftikhar

One of the best series, if not the best, at describing DRL.
Good job!

Yahia.N_Ahmed

Your videos are like espresso: condensed, tasty, full-bodied, but you should not try to rush when watching them. There are no spare words, so when you miss one, you're lost 😀 Great video, I love that logical structure, rock solid!

marcin.sobocinski

At 15:46 you said "if that policy is greedy with respect to that value function", but I don't quite understand what you meant by that. Other than that, the video is crystal clear. Thank you for these videos.

hypershadow
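
On the question above: in the terminology of [1], a policy is greedy with respect to a value function v if, in every state, it picks an action that maximizes the one-step lookahead, that is, the expected reward plus the discounted v of the next state. A minimal sketch follows; the three-state model, the discount factor, and the value estimates are all made up for illustration and do not come from the video.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical model: MODEL[state][action] is a list of
# (probability, next_state, reward) outcomes. State 2 is terminal.
MODEL = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"back": [(1.0, 0, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 0, 0.0)]},
}


def one_step_lookahead(state, action, v):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in MODEL[state][action])


def greedy_policy(v):
    """In each state, choose an action with the highest one-step lookahead."""
    return {
        s: max(MODEL[s], key=lambda a: one_step_lookahead(s, a, v))
        for s in MODEL
    }


v = {0: 0.5, 1: 0.9, 2: 0.0}  # made-up value estimates
print(greedy_policy(v))
```

Policy improvement is exactly this step applied to v_π. If the resulting greedy policy is the same as π, then v_π satisfies the Bellman optimality equation, so π is optimal.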

This is so well done! Explaining stuff well can be very difficult. Thanks a lot! I'm studying RL in a university course, but this was way more helpful!

vesk

Excellent video. Even though I have been studying RL for a while, the video clarified some previously learned concepts and gave me a better understanding of the topic.

usonian

This series of videos is really nice. I would love to see you go more into the theory/proofs of why policy iteration works... as another series. Once again, really good work.

katchdawgs

It turns out that in fact, algebra *is* fun, cool, and exciting

AcademicDisciplineHD