RL Course by David Silver - Lecture 6: Value Function Approximation


Comments

I'm grateful that David was successfully sampled in this iteration of the universe.

TheAIEpiphany

"Life is one big training set"
D. Silver, 2015

Chrnalis

0:00 Motivations
0:35 Outline

0:35 Large-Scale Reinforcement Learning
3:55 Value Function Approximation
8:40 Types of Value Function Approximation
11:55 Which Function Approximator?

15:38 Incremental Methods
15:43 Gradient Descent
17:35 Value Function Approx. by Gradient Descent
21:35 Feature Vectors
23:23 Linear Value Function Approximation
28:43 Table Lookup Features
30:50 Incremental Prediction Algorithms
31:10 Monte-Carlo with Value Function Approximation
37:33 TD Learning with Value Function Approximation
41:56 TD(lambda) with Value Function Approximation

49:00 Control with Value Function Approximation
52:30 Action-Value Function Approximation
53:50 Linear Action-Value Function Approximation
55:20 Incremental Control Algorithms
56:20 Linear Sarsa with Coarse Coding in Mountain Car

1:04:30 Study of lambda: Should we Bootstrap?
1:06:10 Baird's Counterexample
1:06:30 Parameter Divergence in Baird's Counterexample
1:06:50 Convergence of Prediction Algorithms
1:08:00 Gradient Temporal-Difference Learning
1:09:00 Convergence of Control Algorithms

1:10:19 Batch Methods
1:12:30 Batch Reinforcement Learning
1:13:30 Least Squares Prediction
1:15:25 Stochastic Gradient Descent with Experience Replay
1:17:25 Experience Replay in Deep Q-Networks (DQN)
1:24:46 DQN in Atari
1:26:00 How much does DQN help?
1:27:35 Linear Least Squares Prediction (2)
1:32:29 Convergence of Linear Least Squares Prediction Algorithms
1:32:50 Least Squares Policy Iteration
1:34:15 Chain Walk Example
1:35:00 LSPI in Chain Walk: Action-Value Function
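The prediction segment above (30:50-41:56) can be sketched in a few lines. Below is a minimal semi-gradient TD(0) update with linear, table-lookup features; the toy 3-state chain, step size, and discount are my own illustrative assumptions, not taken from the lecture:

```python
import numpy as np

def td0_linear(episodes, features, alpha=0.1, gamma=0.9):
    """Learn weights w so that v(s) ~= w . x(s) via semi-gradient TD(0)."""
    n = features(0).shape[0]
    w = np.zeros(n)
    for episode in episodes:                # episode: list of (s, r, s_next, done)
        for s, r, s_next, done in episode:
            x = features(s)
            v = w @ x
            v_next = 0.0 if done else w @ features(s_next)
            td_error = r + gamma * v_next - v   # TD target minus current estimate
            w += alpha * td_error * x           # step along the feature vector
    return w

# One-hot (table-lookup) features for a 3-state chain: 0 -> 1 -> 2 (terminal)
features = lambda s: np.eye(3)[s]
# Deterministic episode with reward 1 on the final transition
episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
w = td0_linear([episode] * 500, features)
```

With one-hot features this reduces exactly to tabular TD(0), which is the point made on the "Table Lookup Features" slide: table lookup is the special case of linear approximation.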

yasseraziz

This guy is definitely the best at explaining RL.

proximo

It's going to be pretty confusing the first time you hear about it, but give it time: try to understand some of it, then come back and watch it again, and you'll see how much more you can comprehend.

MinhVu-fohd

This video is yet another example that the complexity of a subject really depends on the one who is teaching it. Thank you David, for making RL so much more accessible and understandable! It is a real pleasure listening to those lectures of yours ❤

florentinrieger

This is by far the best course on DRL I have ever watched. I was kind of lost among the different notations in different papers, although I can code many of them up. But this video summarizes the root of the problems and the solutions concisely, which lets me make sense of what I've learned in the past. Thanks for the great work!

junzhu

2020 and still watching this to refresh my knowledge 🙋‍♀️

LunahLiu

David Silver is an intelligent and humble person. He explains things very well; I'm glad that I came across his lectures.

shyomd

Holy cow, in the last 30 minutes this lecture goes completely off the rails. The number of concepts introduced and the number of slides shown increase exponentially towards the end :-)

martagutierrez

1:38 Introduction
15:28 Incremental Methods
1:12:17 Batch Methods
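The Batch Methods part of the lecture motivates storing experience and reusing it. A minimal sketch of the replay-buffer idea behind experience replay (around 1:15:25); the capacity and batch size are my own illustrative assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Store transitions; sample random minibatches to break the
    correlation between consecutive samples along a trajectory."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size):
        # Uniform sampling decorrelates the minibatch
        return random.sample(self.buffer, batch_size)

buf = ReplayBuffer()
for t in range(100):
    buf.add(t, 0, 1.0, t + 1, False)   # toy transitions for illustration
batch = buf.sample(8)
```

In DQN this buffer is combined with a periodically frozen target network, the two stabilization tricks the "How much does DQN help?" slide quantifies.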

NganVu

It's amazing how intuitive these concepts become after watching these lectures. David Silver makes the math and theory behind RL, which seem so hard to grasp, impossible to forget.

aidankennedy

This is a lot of knowledge that probably took over 20 years to develop to this level of depth. The question is why someone would share it for free like this. It's a GIFT!

willlogs

Golden information for free. Thanks, DeepMind! We need a more up-to-date course, though :/

robosergTV

This reminds me of the quote by Lex Fridman that goes, "All machine learning is supervised learning. What differs is how that supervision is done."

existenence

I don't really care about the math in the lecture; I just enjoy listening to David talk about state-of-the-art RL and suddenly catching his ideas.

ThuongPham-wgbc

Very easy to understand! Outstanding lecture! Outstanding teacher!

trantandat

It would be great if this video had subtitles.

TheKovosh

Amazing lecture. Thanks for the upload.

SadmanSakibEnan

16:39
- the negative gradient is the direction of steepest descent!
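To be precise about the point at 16:39: the gradient points uphill, so gradient descent steps along the *negative* gradient. A toy sketch of one-dimensional gradient descent; the quadratic objective is my own example, not from the lecture:

```python
# Minimize J(w) = (w - 3)^2, whose minimum is at w = 3.

def grad_J(w):
    return 2.0 * (w - 3.0)   # dJ/dw: points away from the minimum

w, alpha = 0.0, 0.1          # start far from the optimum; small step size
for _ in range(100):
    w -= alpha * grad_J(w)   # the minus sign moves *against* the gradient
```

Dropping the minus sign would ascend the objective instead, which is exactly the confusion the timestamped remark is about.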
