Reinforcement Learning from scratch

How does Reinforcement Learning work? A short cartoon that intuitively explains this amazing machine learning approach, and how it was used in AlphaGo and ChatGPT.
Part 1 of 3.

0:00 - intro
0:13 - pong
0:28 - the policy
0:51 - policy as neural network
1:32 - supervised learning
2:51 - reinforcement learning using policy gradient
4:24 - minimizing error using gradient descent
4:45 - probabilistic policy
5:01 - pong from pixels
6:58 - visualizing learned weights
8:18 - pointer to Karpathy "pong from pixels" blogpost
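
For readers who want to try the policy gradient idea from the chapters above, here is a minimal sketch. It is not the video's code and not Karpathy's implementation. It assumes a hypothetical one-step stand-in for Pong: the observation `x` is +1 if the ball is above the paddle and -1 if below, the policy is a single logistic unit, and the reward is +1 for moving toward the ball and -1 otherwise. The names `train`, `w`, `b`, and `lr` are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(episodes=5000, lr=0.1, seed=0):
    """REINFORCE on a one-step toy stand-in for Pong (an assumed setup)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # policy parameters: P(up | x) = sigmoid(w * x + b)
    for _ in range(episodes):
        # Hypothetical observation: +1 if the ball is above the paddle, -1 if below.
        x = rng.choice([1.0, -1.0])
        p_up = sigmoid(w * x + b)
        # Probabilistic policy: sample the action instead of always taking the likeliest one.
        action = 1 if rng.random() < p_up else 0  # 1 = up, 0 = down
        # Assumed reward: +1 for moving toward the ball, -1 otherwise.
        reward = 1.0 if (action == 1) == (x > 0) else -1.0
        # Gradient of log pi(action | x) with respect to the logit of a Bernoulli policy.
        grad_logit = action - p_up
        # Policy gradient step: rewarded actions become more likely, punished ones less.
        w += lr * reward * grad_logit * x
        b += lr * reward * grad_logit
    return w, b


if __name__ == "__main__":
    w, b = train()
    print("P(up | ball above):", sigmoid(w + b))
    print("P(up | ball below):", sigmoid(-w + b))
```

The reward never enters the network as an input here; it only scales the weight update, and the learning rate `lr` sets how large each change is. A real Pong agent would need rewards spread over many time steps and a network that reads pixels, which this sketch leaves out.
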
Comments

This video is super underrated. In fact, the whole channel is underrated.

darthvader

Too beautiful. You can watch this kind of video all day without getting bored.

Arivan_Abdulla

I don't know how I stumbled upon this video but that was very interesting and intuitive to understand. Thank you.

ashketchum

agi: 1. AI develops an understanding of win-loss conditions and sets policy params (inputs & actions) accordingly. 2. AI creates (= designs & builds) training env(s). 3. AI iterates, evaluates & adjusts policy parameters accordingly. 4. Done (or validation run(s) w/ human(s)).

themaxgo

Your channel IS SO GREAT, I share it with all my eng friends so you get more visibility!

themathguy

This video is amazing. You explained everything in such a simple manner. I am feeling really motivated to learn more about reinforcement learning and neural networks after watching this.

tushargupta

This is really awesome! It's the best video that explains DRL in such an easy-to-understand way!

metaljacket

I agree. Once you see how it all works, it seems like 1s and 0s. Give me some feedback on r/grand unified theory or cosmo knowledge.

Sumpydumpert

I really like the way you visualize what you are talking about. Thank you for putting in the effort!

CptDoge-rnou

Your videos are great. Looking forward to more!

a.aspden

Great video, very helpful, easy to understand.

marcinstrzesak

Can you make a playlist for each of your topics, please?
I wanted to post your video topics on Twitter (X) but could only post a single video at a time.

Great content by the way. Ty very much.
Your perspective on some topics helped me a lot to get a more intuitive understanding.

BlueBirdgg

I get how the model can see moves and output an up or down action. But I don't get how the model tracks the score for rewards, etc.

Can someone explain how the reward is fed into the model?

william_

What is your reward function for the pong game? I did a similar pong game and I couldn't get it to learn.

edvinbeqari

Simple reinforcement learning is extremely dangerous in certain nonstationary environments 😅

axe

But by what amount do you change the weights? You never told us.

mineq

How many layers should such a network have?

bombur

Imagine using reinforcement learning in quantitative finance 😊

herikaniugu

Can you share the source code for this project?

FRANKONATOR

ah yes, reinforcement learning. a fundamental computer graphics technology

macratak