Python Reinforcement Learning using Gymnasium – Full Course

Learn the basics of reinforcement learning and how to implement it using Gymnasium (the maintained successor to OpenAI Gym). Gymnasium is an open-source Python library maintained by the Farama Foundation that provides a collection of pre-built environments for reinforcement learning agents. It provides a standard API for communication between learning algorithms and environments, as well as a standard set of environments compliant with that API.
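
To make that API concrete, here is a minimal agent-environment loop (a sketch, assuming gymnasium is installed; any registered environment ID works in place of "Blackjack-v1"):

import gymnasium as gym

env = gym.make("Blackjack-v1")
obs, info = env.reset(seed=42)           # reset returns the first observation and an info dict

done = False
while not done:
    action = env.action_space.sample()   # random policy; a learning agent would choose here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated       # episodes end by termination or truncation

env.close()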

Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward.
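
For reference, "cumulative reward" is usually formalized as the discounted return that the agent tries to maximize (a standard textbook definition, not specific to this course):

G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \quad 0 \le \gamma \le 1

where R_{t+k+1} is the reward received k steps after time t and \gamma is the discount factor.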

✏️ Course developed by @EverythingTechWithMustafa

⭐️ Contents ⭐️
⌨️ (0:00:00) Introduction
⌨️ (0:04:19) Reinforcement Learning Basics (Agent and Environment)
⌨️ (0:12:15) Introduction to Gymnasium
⌨️ (0:14:59) Blackjack Rules and Implementation in Gymnasium
⌨️ (0:18:27) Solving Blackjack
⌨️ (0:19:46) Install and Import Libraries
⌨️ (0:23:19) Observing the Environment
⌨️ (0:27:55) Executing an Action in the Environment
⌨️ (0:33:01) Understand and Implement Epsilon-greedy Strategy to Solve Blackjack
⌨️ (0:42:28) Understand the Q-values
⌨️ (0:47:29) Training the Agent to Play Blackjack
⌨️ (0:57:10) Visualize the Training of Agent Playing Blackjack
⌨️ (1:04:34) Summary of Solving Blackjack
⌨️ (1:09:57) Solving Cartpole Using Deep Q-Networks (DQN)
⌨️ (2:29:29) Summary of Solving Cartpole
⌨️ (2:34:07) Advanced Topics and Introduction to Multi-Agent Reinforcement Learning using PettingZoo
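
For the epsilon-greedy (0:33:01) and Q-values (0:42:28) sections, here is a rough, self-contained sketch of the tabular Q-learning loop the Blackjack part of the course builds up (hyperparameter values are placeholders, not necessarily those used in the video):

import gymnasium as gym
import numpy as np
from collections import defaultdict

env = gym.make("Blackjack-v1")
q_values = defaultdict(lambda: np.zeros(env.action_space.n))  # Q-table, one row per observation
learning_rate, discount_factor = 0.01, 0.95
epsilon, epsilon_decay, final_epsilon = 1.0, 1e-4, 0.1

for episode in range(10_000):
    obs, info = env.reset()
    done = False
    while not done:
        # epsilon-greedy: explore with probability epsilon, otherwise act greedily
        if np.random.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_values[obs]))
        next_obs, reward, terminated, truncated, info = env.step(action)
        # tabular Q-learning update; the future value is zero at terminal states
        future_q = (not terminated) * np.max(q_values[next_obs])
        td_error = reward + discount_factor * future_q - q_values[obs][action]
        q_values[obs][action] += learning_rate * td_error
        obs, done = next_obs, terminated or truncated
    epsilon = max(final_epsilon, epsilon - epsilon_decay)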

Correction:
00:09 Gymnasium is maintained by the Farama Foundation and is not associated with OpenAI.

🎉 Thanks to our Champion and Sponsor supporters:
👾 Nattira Maneerat
👾 Heather Wcislo
👾 Serhiy Kalinets
👾 Erdeniz Unvan
👾 Justin Hual
👾 Agustín Kussrow
👾 Otis Morgan

--

Comments

I hope you all liked this course. Make sure to leave your feedback!

EverythingTechWithMustafa

Hey, I’m the maintainer of Gymnasium. It’s not affiliated with OpenAI in any way (though Gym used to be); it’s part of the Farama Foundation. That’s why the repo is under the Farama Foundation organization, as is the website. The home page of the Gymnasium website and the READMEs of Gym and Gymnasium both make this clear.

jordanterry

Some typos to help get some people through:

1) 21:21 - collection should be collections; otherwise the error raised will say it doesn't exist
2) 22:30 - patch should be capitalised, i.e. 'Patch'
3) 24:46 - true should be capitalised, i.e. 'True'
4) 30:20 - I recommend including env.reset() on line 4 to reset the episode first
5) 35:15 - Just to note, there are TWO underscores on either side of init, not just one: i.e. __init__, NOT _init_. It can be difficult to see :)
6) 35:25 - the colons were meant to be underscores, I am fairly sure, i.e. change learning:rate to learning_rate
7) 36:45 - typo: change env.action_dpace.n to env.action_space.n
8) 40:20 - typo: selg should be self
9) 40:32 - typo: tupe should be tuple (fixed later on anyway :D)
10) 44:48 - a colon is needed after the closing bracket of update on line 47
11) 45:54 - typo: actions should be action
12) 47:24 - typo: missing self. prefixes in the class are causing problems; I think line 41 should actually read: self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)
13) 47:54 - typo: it should be learning_rate = 0.01, NOT learning_rate:0.01
14) 50:53 - OK, wow, this one took some time to get to a place where it wasn't going to give me a syntax error. For some reason, the instantiation of the wrapper was just not working with any variation of this code. To solve this, I first made sure I had imported deque and RecordEpisodeStatistics, as shown below:

from collections import deque
from gym.wrappers import RecordEpisodeStatistics

and then wrote the following line instead of line 2 of this cell:
env = gym.wrappers.RecordEpisodeStatistics(env, deque_size=n_episodes)

(Again, sorry for any mistakes of my own; btw, he does fix this before running it at 55:42.)


15) 53:05 - Typo: ob should be changed to obs
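
Putting fixes 5), 6), 12), and 13) together, the relevant pieces should end up looking roughly like this (my reconstruction, so names may differ slightly from the notebook):

class BlackjackAgent:
    def __init__(self, learning_rate, initial_epsilon, epsilon_decay, final_epsilon):
        # fix 5: two underscores on each side of init
        self.learning_rate = learning_rate    # fix 6: underscore, not a colon
        self.epsilon = initial_epsilon
        self.epsilon_decay = epsilon_decay
        self.final_epsilon = final_epsilon

    def decay_epsilon(self):
        # fix 12: the self. prefixes are required
        self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)

agent = BlackjackAgent(
    learning_rate=0.01,    # fix 13: '=', not ':'
    initial_epsilon=1.0,
    epsilon_decay=0.0001,
    final_epsilon=0.1,
)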

I will be honest: the content within the tutorial is very good, and what it can be used for is definitely something people should look into. The problem is the quality of the teaching process. Within the first hour I found the mistakes listed above, and it makes for a tedious process when you cannot verify whether the typos are right or wrong, because the code cells are never run until around 55 minutes in. It is good to have the notebook available as a Google Colab notebook, but if misunderstandings are not cleared up along the way, it is much harder to be sure the code will work the way you expect.

I appreciate you taking the time to make the tutorial Mustafa and wish you the best in the future courses you provide :)

kingvolpes

Wow, a bait and switch. When I saw the intro at the beginning, I thought "oh great, a clear accent, nice microphone, high production quality, excellent!" and then it switches to, well, you know, something a bit more typical for YouTube.

RyanMartinRAM

The Google Colab link is not working; can you please make it available? A lot of doubts.

zkpxxok

I'm trying to get to the code, but the link you shared isn't working ("Sorry, the file you have requested does not exist"). So please share the right link to access the code!

adamharb

Just what I needed at the right moment! Thanks!

swastikgorai

Can anyone please send me the fixed link to this notebook? :(
I'm currently studying with this video, but the missing notebook is making me suffer.

snowyfield

Where is the notebook for the Cart Pole example?

ECE_DevVidit

The Google Colab link is not working; can you please make it available?

seesea

Does anyone have the source code from the video? The Colab link in the video introduction is not working. Thank you.

jruenvo

Thank you for the tutorial and the code! But I think you should focus more on explaining the mechanism of the code rather than just reading and typing it word by word. I actually learned by reading the Colab notebook; I couldn't finish your video. Anyway, thank you very much!

queenslaands

Book recommendation: "A Primer to the 42 Most commonly used Machine Learning Algorithms (With Code Samples)."

curiousphilosopher

Thank youuuu. Can you explain RL coding for high-altitude platform applications?

AIdreamer_AIdreamer

Woah, I received an assignment on Deep Q-Networks for reinforcement learning which I have to submit by the weekend and here this is 😮😂

deekshantwadhwa

I am getting an error when I use obs: tuple[int, int, bool] as a function parameter. Can anybody explain why?

ishaquenizamani

Some constructive feedback: I noticed a lot of typos in the code. I suggest running the code instead of jumping to the next section; showing errors helps us understand as well. Also, you are a bit hard to understand.

nikolaaswillaert

Why are there so many typos? Thanks to ChatGPT, these typos can be fixed.

ayushshaw

One of the worst coding guides I've ever seen. Such a shame, because I really want to learn this topic and there are so few guides out there. Oh well, at least I can copy the code and analyze it.

Samuelsward