What is Q-Learning (back to basics)

#qlearning #qstar #rlhf

What is Q-Learning and how does it work? A brief tour through the background of Q-Learning, Markov Decision Processes, Deep Q-Networks, and other basics necessary to understand Q* ;)
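The core idea the video builds up to can be sketched as tabular Q-learning on a toy problem. This is a generic illustration, not code from the video; the chain environment, reward values, and hyperparameters are all made-up assumptions:

```python
import random

# A tiny deterministic chain MDP: states 0..3, reward 1.0 for reaching state 3.
# Actions: 0 = left, 1 = right. All values here are illustrative assumptions.
N_STATES, ACTIONS = 4, (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def step(state, action):
    """Move left/right along the chain; entering the last state gives reward 1."""
    next_state = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

Q = [[0.0, 0.0] for _ in range(N_STATES)]
random.seed(0)
for _ in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s2, r, done = step(s, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# The greedy policy learns to always move right, toward the reward.
policy = [max(ACTIONS, key=lambda act: Q[s][act]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1]
```

Deep Q-learning replaces the table `Q` with a neural network, but the update rule is the same Bellman-style bootstrap.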

OUTLINE:
0:00 - Introduction
2:00 - Reinforcement Learning
7:00 - Q-Functions
19:00 - The Bellman Equation
26:00 - How to learn the Q-Function?
38:00 - Deep Q-Learning
42:30 - Summary

Links:

If you want to support me, the best thing to do is to share the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Comments

this is how you leverage the hype like a true gentleman 😎

raunaquepatra

I have no time for the hype, but I have all the time in the world for a classic Yannic Kilcher paper explanation video

EdFormer

Thanks for such a solid, fundamental introduction to Q-learning, especially at a time when many are really excited about Q-star but few seem to try to understand its basic principles.

changtimwu

Thank you Yannic, your style of surfing the hype is the best!!!

Alilinpow

Love these paper videos, the reason I subscribed to the channel :)

qwertywifi

This was very informative. Thank you so much for sharing.

K.F-R

I would be very interested in seeing a series of paper/concept reviews such as this focusing on the state of the art in RL

OrdniformicRhetoric

thanks for posting this; good to see some real content

travisporco

Another awesome video from you Yannic! Gold material on this channel.

agenticmark

I need someone to upload the Q function to my brain so my life choices start making sense

ProblematicBitch

This is great. You’re a true wizard in explaining Q, and I love the anonymous look with the sunglasses. You’re a regular Q-anon shaman.

nickd

I will make sure to stay hydrated, thank you

drdca

I did deep q learning for my cs bachelors thesis way back. Thank you so much for reminding me of those memories.

ceezar

18:00 In chess terms, 'Reason 1' can be likened to: 1) Choosing a1 means you won't capture any of your opponent's pieces. 2) Opting for a2 allows you to swiftly capture a substantial piece.

vorushin

By far the most effective way of learning. Hacking at the essence, in a chain of thought manner.

matskjr

A good example of what you were talking about just before the Bellman equation would be that Move B (10 reward) will help capture a chess piece in the future, whereas Move A will move away from that outcome, or maybe even let the piece be taken by the opponent, making the 'next move' the 'policy' would want impossible.
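That chess intuition is exactly what the Bellman equation encodes: an action's value is its immediate reward plus the discounted value of the best follow-up. A minimal numeric sketch (the discount factor and all reward numbers are made-up illustrations, not from the video):

```python
GAMMA = 0.9  # discount factor (illustrative)

# Q-values of the successor states (made-up numbers):
# after Move A the best follow-up is worth only 2 (the capture is gone),
# after Move B the best follow-up is worth 30 (the capture is set up).
q_after_A = {"follow_up_1": 2.0, "follow_up_2": 1.0}
q_after_B = {"follow_up_1": 30.0, "follow_up_2": 5.0}

# Bellman: Q(s, a) = r(s, a) + gamma * max_a' Q(s', a')
q_A = 12.0 + GAMMA * max(q_after_A.values())  # bigger immediate reward, poor future
q_B = 10.0 + GAMMA * max(q_after_B.values())  # smaller immediate reward, great future

print(q_A, q_B)  # Move B wins despite the smaller immediate reward
```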

draken

I realize that I read this paper ten years ago. Now I'm ten years older omg.

tglvhvr

Old paper review - yeah! We missed that.

drhilm

My dude, that point you mention at 45:05, right at the end, about having the state and actions be the input, is exactly the question I've been trying to find an answer to. Seeing and hearing it mentioned twice, but each time you said you weren't going to talk about it, felt like a knife in the heart. If you don't do a video on it, do you have papers that talk through how this has been done? Great stuff either way; I was able to learn a bunch.
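On the state-and-action-input question raised above: the two common ways of parameterizing a Q-function can be sketched without any deep-learning library. The linear "networks" and weights below are purely illustrative stand-ins, not anything from the video:

```python
# Parameterization 1 (classic DQN): the network maps a state to one Q-value
# per action, so a single forward pass scores every discrete action at once.
def q_per_action(state):
    W = [[1.0, 0.5],   # illustrative fixed weights, one row per action
         [0.2, 2.0]]
    return [sum(w * x for w, x in zip(row, state)) for row in W]

# Parameterization 2: the network takes a (state, action) pair and returns a
# single scalar. Handy when actions are structured or continuous, but finding
# argmax_a then needs one forward pass per candidate action.
def q_state_action(state, action_onehot):
    w = [1.0, 0.5, 0.2, 2.0]  # illustrative weights over concatenated input
    x = state + action_onehot
    return sum(wi * xi for wi, xi in zip(w, x))

s = [1.0, 2.0]
print(q_per_action(s))                # scores for both actions in one pass
print(q_state_action(s, [0.0, 1.0]))  # score for action 1 only
```

The per-action output head is what the original DQN paper used for Atari, precisely because the greedy max over actions then costs a single forward pass.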

jackschultz

During your explanation, Dijkstra's algorithm came to my mind. They say that this Q* can increase the processing needs some 1000 times: you check all the paths in your graph and choose the ideal one.

EnricoGolfettoMasella