Reinforcement Learning from scratch

How does Reinforcement Learning work? A short cartoon that intuitively explains this amazing machine learning approach, and how it was used in AlphaGo and ChatGPT.
Part 1 of 3.

0:00 - intro
0:13 - pong
0:28 - the policy
0:51 - policy as neural network
1:32 - supervised learning
2:51 - reinforcement learning using policy gradient
4:24 - minimizing error using gradient descent
4:45 - probabilistic policy
5:01 - pong from pixels
6:58 - visualizing learned weights
8:18 - pointer to Karpathy "pong from pixels" blogpost
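
As a rough companion to the chapters on the policy, supervised learning, gradient descent, and the probabilistic policy, here is a minimal sketch of those ideas. It assumes a single logistic unit rather than the network shown in the video, and the names `policy_prob_up`, `sample_action`, and `supervised_step` are made up for illustration.

```python
import math
import random


def policy_prob_up(weights, features):
    """Probability of moving the paddle UP, from one logistic unit (an assumption)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def sample_action(weights, features, rng):
    """Probabilistic policy: sample UP (1) or DOWN (0) instead of always taking the likelier one."""
    return 1 if rng.random() < policy_prob_up(weights, features) else 0


def supervised_step(weights, features, label, lr=0.1):
    """One gradient-descent step toward a known correct action (label is 1 for UP, 0 for DOWN).

    For a logistic unit with cross-entropy loss, the gradient with respect
    to each weight is (p_up - label) * feature.
    """
    p_up = policy_prob_up(weights, features)
    return [w - lr * (p_up - label) * x for w, x in zip(weights, features)]


if __name__ == "__main__":
    rng = random.Random(0)
    w = [0.0, 0.0]
    x = [1.0, 2.0]  # hypothetical features, e.g. ball position and paddle position
    print("P(UP) before:", policy_prob_up(w, x))
    w = supervised_step(w, x, label=1)
    print("P(UP) after one step toward UP:", policy_prob_up(w, x))
    print("sampled action:", sample_action(w, x, rng))
```

Supervised learning needs the correct label for every frame. The reinforcement learning chapters drop that requirement and use only the game outcome.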

Graphics in 5 Minutes

Comments

This video is super underrated. In fact, the whole channel is underrated.

darthvader

Too beautiful. You can watch this kind of video all day without getting bored.

Arivan_Abdulla

I don't know how I stumbled upon this video but that was very interesting and intuitive to understand. Thank you.

ashketchum

AGI: 1. The AI develops an understanding of win-loss conditions and sets policy parameters (inputs and actions) accordingly. 2. The AI creates (designs and builds) training environment(s). 3. The AI iterates, evaluates, and adjusts policy parameters accordingly. 4. Done (or validation run(s) with human(s)).

themaxgo

Your channel is SO GREAT. I share it with all my eng friends so you get more visibility!

themathguy

This video is amazing. You explained everything in such a simple manner. I am feeling really motivated to learn more about reinforcement learning and neural networks after watching this.

tushargupta

This is really awesome! It's the best video for explaining DRL in such an easy-to-understand way!

metaljacket

I agree. Once you see how it all works, it seems like 1s and zeros. Give me some feedback on r/grand unified theory or cosmo knowledge.

Sumpydumpert

I really like the way you visualize what you are talking about. Thank you for putting in the effort!

CptDoge-rnou

Your videos are great. Looking forward to more!

a.aspden

Great video, very helpful, easy to understand.

marcinstrzesak

Can you make a playlist for each of your topics, please?
I wanted to post your video topics on Twitter (X) but could only post a single video at a time.

Great content, by the way. Thank you very much.
Your perspective on some topics helped me a lot in getting a more intuitive understanding.

BlueBirdgg

I get how the model can see moves and output an up or down action. But I don't get how the model tracks the score for rewards, etc.

Can someone explain how the reward is fed into the model?

william_
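
One way to picture the question above: in a basic policy-gradient (REINFORCE-style) setup, the reward is not an input to the network at all. The game loop records the reward at each step, and after the episode those rewards scale how strongly each taken action is made more or less likely. The sketch below assumes a single logistic unit as the policy and a plain discounted return. The function names and episode format are hypothetical and not taken from the video.

```python
import math


def prob_up(weights, features):
    """P(UP) from a single logistic unit (a stand-in for the policy network)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def discounted_returns(rewards, gamma=0.99):
    """Spread each reward backwards in time so earlier actions share the credit."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_update(weights, episode, gamma=0.99, lr=0.01):
    """episode is a list of (features, action, reward), with action 1 for UP and 0 for DOWN.

    The reward never enters prob_up. It only multiplies the gradient of
    log pi(action), which for a logistic unit is (action - p_up) * feature.
    """
    returns = discounted_returns([r for _, _, r in episode], gamma)
    new_weights = list(weights)
    for (features, action, _), g in zip(episode, returns):
        p_up = prob_up(weights, features)
        for i, x in enumerate(features):
            new_weights[i] += lr * g * (action - p_up) * x
    return new_weights


if __name__ == "__main__":
    # Two frames of a made-up rally: the agent moved UP twice and then won the point (+1).
    episode = [([1.0], 1, 0.0), ([1.0], 1, 1.0)]
    print(reinforce_update([0.0], episode))
```

With a positive return, the actions that were taken become more likely. With a negative return (a lost point), the same update pushes their probability down.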