Overview of Deep Reinforcement Learning Methods

This video gives an overview of methods for deep reinforcement learning, including deep Q-learning, actor-critic methods, deep policy networks, and policy gradient optimization algorithms.

This is a lecture in a series on reinforcement learning, following the new Chapter 11 from the 2nd edition of our book "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz.

This video was produced at the University of Washington.
Comments

10:20 I think in this example the state probability density function is assumed stationary for an ergodic environment, even in the case of a dynamic policy. So perhaps this assumption implies a static reward function from the given environment, which would not be the case in a dynamic environment like a medical patient whose bodily response to a drug would vary throughout their lifetime/treatment. I checked: Sutton and Barto indeed mention ergodicity of the environment as the reason for a policy-independent mu in their book, on pp. 326 and 333.

MaximYudayev
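For readers following along, the formula under discussion at 10:20 appears to be the policy gradient theorem; in Sutton and Barto's notation it reads

\nabla_\theta J(\theta) \;\propto\; \sum_{s} \mu(s) \sum_{a} q_\pi(s,a)\, \nabla_\theta \pi(a \mid s, \theta),

where \mu(s) is the on-policy state distribution. The key feature is that the right-hand side contains no \nabla_\theta \mu(s) term, which is why the question of whether \mu depends on the policy comes up at all.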

At 10:05, my understanding is that the fact that we do not differentiate that probability comes from a local-approximation assumption, so the formula is only approximately true for changes that are not too big. This simplification is one of the most important parts of the policy gradient theorem, and it informs the design of "soft" policy-gradient algorithms, in which we do not allow the policy to change too much, since the update logic only works for small steps.

matiascova
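One concrete example of the "soft" update idea described above is the clipped surrogate objective used in PPO-style methods, which keeps each gradient step close to the policy that collected the data. A minimal sketch in PyTorch (the function name, arguments, and clip_eps value are illustrative, not taken from the lecture):

import torch

def clipped_surrogate_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """PPO-style clipped surrogate objective, negated so it can be minimized.

    log_probs_new : log pi_theta(a|s) under the current policy (requires grad)
    log_probs_old : log pi_theta_old(a|s) from the policy that collected the data
    advantages    : advantage estimates for the sampled (s, a) pairs
    """
    # Probability ratio r(theta) = pi_theta(a|s) / pi_theta_old(a|s)
    ratio = torch.exp(log_probs_new - log_probs_old.detach())

    # Taking the elementwise minimum of the unclipped and clipped terms removes
    # any incentive to move the ratio outside [1 - clip_eps, 1 + clip_eps].
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()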

This is literally the best series for understanding RL ever. Thank you so much, professor, for sharing this.

BoltzmannVoid

Thanks, Professor Steve. Once I hear "welcome back," I just know it's our original Professor Steve 😀👍

metluplast

I hope you'll create a series where all of the equations from these lectures are applied in PyTorch to build simple projects; that would be awesome.

mawkuri

Excellent video; it basically saved my day in trying to wrap my head around all the terms and algorithms :D
The concepts have been presented with unmatched clarity and conciseness.
I have been waiting for this since your last video on "Q-Learning".

Thank you so much!

gbbhkk

Thanks for the video! Can't wait for that deep MPC video.

Rodrigoviverosa

This is a fantastic tutorial. Thanks for putting in the time and effort to make it so digestible.

dmochow

Thank you, professor. This has been great for dusting off some RL concepts I had forgotten.

BlueOwnzU

6:22 But Professor, you know we love math derivations!

FRANKONATOR

@Eigensteve
Amazing video lectures. I have watched several of your series.
Please, if possible, make a series about deep MPC; it would be of great value.

wkafa

Steve, I follow all of your lectures. As a mechanical engineer, I was really amazed by your turbulence lectures. I have personally worked with CFD using scientific Python, doing visualization and computation, and have published a couple of research articles. I'm very eager to work under your guidance in the field of CFD and fluid dynamics using machine learning, specifically simulating and modelling turbulent flow fields, and to explore the mysterious world of turbulence. How should I reach you for further communication?

ramanujanbose

@10:33 Steve, maybe mu sub theta is just a vector of constants for the means associated with the asymptotic distribution of each state s, used to scale the sum of weighted probabilities across all actions for that state relative to each state's asymptotic distribution?

ryanmckenna

10:20, I think it's because we usually use PG in models with infinite state-action pairs, so in other words mu(s) is untrackable. It's something like the latent space of an autoencoder, where we can't really track it in order to generate data.

sarvagyagupta
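A related point for anyone puzzling over mu(s): in practice it never has to be computed, because the sum over states can be rewritten as an expectation over states visited while following the current policy,

\nabla_\theta J(\theta) \;\propto\; \mathbb{E}_{s \sim \mu,\, a \sim \pi}\!\left[ q_\pi(s,a)\, \nabla_\theta \log \pi(a \mid s, \theta) \right],

which is estimated simply by sampling trajectories. This is the standard log-derivative (REINFORCE) form, so mu(s) enters only implicitly through where the policy actually spends its time.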

Excuse me, professor, I am not sure about this specific case: if we have a DRL architecture that interacts with an ad hoc model we have built (which has a given structure as a Markov decision process), but the DRL agent does not have any prior information about the mechanics of that model (it can only measure outputs and generate inputs), would this be considered model-free?
Thank you for your amazing work!

joel.r.h

So my strategy for a better explanation would be to do it like Andrej: start off with a toy example of the real algorithm and also show the Python toy code. Explain how it is connected to other models. After that you can start with the math derivation, which is mostly interesting only for ML theorists.

randywelt
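In that spirit, here is a self-contained toy sketch of REINFORCE on a two-armed bandit in NumPy; the problem, names, and hyperparameters below are illustrative and not taken from the lecture:

import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # expected reward of each arm; the second arm is better
theta = np.zeros(2)                 # policy parameters (softmax preferences)
alpha = 0.1                         # learning rate

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(2000):
    probs = softmax(theta)
    action = rng.choice(2, p=probs)               # sample an action from pi_theta
    reward = rng.normal(true_means[action], 0.1)  # noisy reward from the environment
    grad_log_pi = -probs                          # d/dtheta_i log pi(action) = 1{i==action} - pi(i)
    grad_log_pi[action] += 1.0
    theta += alpha * reward * grad_log_pi         # REINFORCE: step along the reward-weighted score function

print("learned action probabilities:", softmax(theta))  # should put most probability on the better arm

Without a baseline this estimator is noisy, but it is the same gradient-of-log-probability idea that the deep versions scale up with neural networks.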

In DDQN, did you need the Q function (Theta_2) inside the gradient involving d/dTheta?

add-mtxc
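On the question above: in Double DQN as described by van Hasselt et al., the target is computed with the second (target) network but is treated as a constant, so it does not sit inside the gradient. Writing theta for the online network and theta' for the target network (the lecture's Theta_1/Theta_2 labels may be assigned the other way around), the update is based on

y = r + \gamma\, Q\!\big(s',\, \arg\max_{a'} Q(s', a'; \theta);\; \theta'\big),
\qquad
\mathcal{L}(\theta) = \big(y - Q(s, a; \theta)\big)^2,

and d\mathcal{L}/d\theta is taken only through Q(s, a; \theta), with y held fixed (a stop-gradient or .detach() in code).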

Awesome 🎉. Where can I find similar code tutorials?

a_samad

Prof. Brunton, are you using a lightboard for the lectures? Do you have advice on which one to purchase?

add-mtxc

You are great!!! A really helpful video. But sir, you did not talk about the MDP.

MrAsare