Deep Q-Learning Networks

Chief Data Scientist Jon Krohn explores deep reinforcement learning, demonstrating its essential theory and walking through Deep Q-Learning Networks (DQNs) hands-on.

The video course is also available on Safari.
Comments

Spaceman, teacher, firefighter, postman, and now AI engineer. You're a real man 👏🏻

greatsaid

Hello Jon, I can't find the code in your GitHub. Can you please help me? I would be thankful.

ImtithalSaeed

Never knew Johnny Sins was into coding

thefrozenwaffle

Can't believe you are such a multi-faceted personality: a doctor, plumber, nurse, astronaut, firefighter, corporate guy, everything! And especially a guy who likes AI!

blackmane

When I run your notebook on my machine, training is much slower: in the video you get through 1,000 episodes in under 30 seconds, but on my machine it takes more like 45 minutes. Do you know what could cause this massive increase in runtime?

maxbird
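
A plausible cause, if the notebook calls Keras's model.predict() once per environment step: in TensorFlow 2, predict() builds a full batched-inference pipeline on every call, which is slow for single samples, and the Keras docs recommend calling the model directly in that case. A minimal sketch, assuming a small CartPole-sized Q-network (the layer sizes here are illustrative, not necessarily the notebook's):

```python
import numpy as np
import tensorflow as tf

# Illustrative stand-in for the notebook's Q-network
# (4 state inputs, 2 actions, as in CartPole).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(24, activation="relu"),
    tf.keras.layers.Dense(2, activation="linear"),
])
model.compile(loss="mse", optimizer="adam")

state = np.zeros((1, 4), dtype=np.float32)

# Slow inside a tight per-step loop: each predict() call sets up
# a whole batched-inference pipeline.
q_values = model.predict(state, verbose=0)

# Much faster for a single sample: call the model directly.
q_values = model(state, training=False).numpy()
```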

Excellent walkthrough of DQN theory. Clear explanation with a hands-on example!

chyldstudios

Everyone wants to be a data scientist, even WWE wrestlers, smh.

DistortedV

I tried to run the code, but it failed with "ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part." on "state = np.reshape(state, [1, state_size])". May I know why? Thank you very much!

xiaofengliu
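
A likely cause, sketched below under the assumption that a newer Gym/Gymnasium is installed: since gym 0.26 (and in gymnasium), env.reset() returns an (observation, info) tuple rather than a bare observation, so np.reshape() receives a two-element tuple and raises exactly this inhomogeneous-shape ValueError. Unpacking the tuple fixes it:

```python
import numpy as np
import gymnasium as gym  # assumption: the newer API; older `gym` differs

env = gym.make("CartPole-v1")
state_size = env.observation_space.shape[0]

# Old API (gym < 0.26): state = env.reset()
# New API: reset() returns (observation, info), so reshaping the tuple
# itself triggers the "(2,) + inhomogeneous part" ValueError.
state, info = env.reset()
state = np.reshape(state, [1, state_size])

# step() likewise now returns five values instead of four.
next_state, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated
```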

At 54:52, doesn't predicting the values for the other actions and then feeding them into the fit function affect the optimizer?

andretelfer
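
For what it's worth, the usual answer for this style of DQN, sketched below with illustrative names rather than the notebook's exact code: the target vector starts out as the network's own prediction and only the taken action's entry is overwritten, so the squared error on every other action is zero and produces no error signal at the output layer.

```python
import numpy as np

def build_target(model, state, action, reward, next_state, done, gamma=0.95):
    """Per-sample DQN target; `model` is a Keras Q-network."""
    target = model.predict(state, verbose=0)          # shape (1, n_actions)
    if done:
        target[0][action] = reward
    else:
        future = np.amax(model.predict(next_state, verbose=0)[0])
        target[0][action] = reward + gamma * future
    # Every other entry still equals the model's own output, so its
    # MSE term is zero; only the taken action drives the update.
    return target
```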

At 50:50 we model the future reward by adding self.gamma * the model's prediction. My understanding is that the model predicts an action, not a reward, so how can we add the action the model suggests to the reward? Can you elaborate on this? Thank you very much!

DanielWeikert
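
One possible clarification: the network doesn't output an action directly; it outputs one estimated Q-value (expected cumulative reward) per action. The target adds the discounted best of the next state's Q-values to the immediate reward. A tiny worked example with made-up numbers:

```python
import numpy as np

# Q(s', a') estimates for the next state, one per action (made-up values).
q_next = np.array([0.7, 1.2])

reward, gamma = 1.0, 0.95

# Bellman target: immediate reward plus the discounted best future value.
target = reward + gamma * np.max(q_next)   # 1.0 + 0.95 * 1.2 = 2.14
```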

Thanks a million, sir. The best tutorial for deep reinforcement learning ever!!

abdelrahmanshehata

Thank you for this excellent tutorial! It's better than several others I watched that I didn't find as helpful. The way you built things up, theoretically and in code, was extremely well organized, well explained, and easy to follow. Much appreciated!

mmartel

I came here to comment something about Johnny Sins, but I see people already did. 😂😂

saifahmadkhan

Correction: at 1:08:20, openai-gym also works on Windows, at least as of this writing.

Phatency

At 1:04:10: we are still performing gradient descent, rather than gradient ascent, because we're using a mean squared error loss between a target Q-value prediction and our current Q-value prediction. The further we are from this target, the larger the mean squared error, and that distance is what stochastic gradient descent minimises.

cernsb
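
Making that concrete with illustrative numbers (a sketch, not the notebook's code):

```python
# Illustrative values only.
q_predicted = 1.4            # current Q(s, a) from the network
q_target = 2.14              # r + gamma * max_a' Q(s', a')

mse = (q_target - q_predicted) ** 2   # 0.74 ** 2 = 0.5476
# Gradient descent minimises this squared distance, pulling the
# predicted Q-value toward the target; the further away it is,
# the larger the loss and the update.
```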

At 24:17: "...we are gonna reinforce ;)..." like a boss!

tonihuhtiniemi

My model isn't learning shit, lol

alexjjgreen

47, we have missions to do. What are you doing here?

ReelTikTube

After watching this video, it seems like I have been taught RL by Johnny Sins 😂

vishalpramanik

The second chapter's title should be "Cart Pole", not "Cart Pool"!

BlackHermit