ROS Developers LIVE-Class #21: A Basic Example of OpenAI with ROS

In this ROS LIVE-Class we're going to learn how to create our own Q-learning training for a cart pole in Gazebo, using both OpenAI and ROS.

We will see:
▸ How to create a Python training program that uses the OpenAI infrastructure (a sketch of the idea follows this list)
▸ How to create the environment that lets us get observations and take actions on a Gazebo simulation
▸ How to interface everything with ROS

Every Tuesday at 18:00 CET / CEST.
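To make the moving parts concrete, here is a minimal sketch of such a training script. It assumes a Gym-registered Gazebo cart-pole environment; the environment id "CartPoleStayUp-v0", the coarse discretization, and the hyperparameter values are illustrative assumptions, not the class code:

    import random
    import gym

    ALPHA = 0.5    # learning rate
    GAMMA = 0.9    # discount factor
    EPSILON = 0.1  # exploration rate

    def discretize(observation, bins=10):
        # Collapse the continuous observation into a hashable Q-table key.
        return tuple(int(round(x * bins)) for x in observation)

    env = gym.make("CartPoleStayUp-v0")  # assumed environment id
    n_actions = env.action_space.n
    q_table = {}  # maps (state, action) -> estimated return

    for episode in range(500):
        state = discretize(env.reset())
        done = False
        while not done:
            # Epsilon-greedy choice over the discrete action space
            if random.random() < EPSILON:
                action = env.action_space.sample()
            else:
                action = max(range(n_actions),
                             key=lambda a: q_table.get((state, a), 0.0))
            observation, reward, done, info = env.step(action)
            next_state = discretize(observation)
            # One-step Q-learning update
            best_next = max(q_table.get((next_state, a), 0.0)
                            for a in range(n_actions))
            old_value = q_table.get((state, action), 0.0)
            q_table[(state, action)] = old_value + ALPHA * (
                reward + GAMMA * best_next - old_value)
            state = next_state

The environment object hides all the ROS/Gazebo plumbing, so the training loop looks exactly like a standard OpenAI Gym script; that separation is the point of the class.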

This is a LIVE Class on how to develop with ROS. In Live Classes, you practice with me in real time as I explain, using the provided free ROS material.

IMPORTANT: Remember to be on time, because we will share the code with the attendees at the beginning of the class.
IMPORTANT 2: So that you can start practicing quickly, we are using the ROS Development Studio for the practice.

// RELATED LINKS

// COMMENTS

What is the purpose of get_clock_time() in the environment file? Thank you!

ledantm
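
In openai_ros-style Gazebo environments, get_clock_time() is usually just a blocking read of the /clock topic, returning the current Gazebo simulation time so that episode timing is measured in sim time rather than wall time. A minimal sketch of such a helper, under that assumption (not necessarily the exact class code):

    import rospy
    from rosgraph_msgs.msg import Clock

    def get_clock_time():
        # Block until Gazebo publishes on /clock, then return the sim time.
        while not rospy.is_shutdown():
            try:
                clock_msg = rospy.wait_for_message("/clock", Clock, timeout=1.0)
                return clock_msg.clock  # rospy.Time with the current sim time
            except rospy.ROSException:
                rospy.logwarn("No /clock message yet; is Gazebo running?")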

I thought the class was on Wednesday! It says in the description "Every Wednesday at 18:00 CET/CEST"

realjsk

Can you publish the updated link for the RDS project?

YossiOvcharik

Hi,
I ran 5000 episodes and solved it.

# I think there are 3 main problems:
1. The position control is too weak and laggy. I changed it to effort (force) control with a much larger value, and it works much better (see the sketch after this list).
2. The learning rate is also too small (0.5 ~ 0.9 is a reasonable range).
3. The running step is too large and the env timestep_limit is too small, so the Q matrix is not large enough to learn.
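As a concrete illustration of point 1, here is a sketch of commanding the cart joint with effort instead of position, via a standard ros_control effort controller (the controller/topic name and the force magnitude below are assumptions, not taken from the class code):

    import rospy
    from std_msgs.msg import Float64

    rospy.init_node("cart_effort_sketch")
    # An effort_controllers/JointEffortController exposes a Float64 command
    # topic; the controller name below is a guess - match it to your config.
    pub = rospy.Publisher("/cartpole_v0/foot_joint_effort_controller/command",
                          Float64, queue_size=1)
    force = 80.0  # assumed magnitude; negate it to push the cart the other way
    rate = rospy.Rate(30)
    while not rospy.is_shutdown():
        pub.publish(Float64(force))
        rate.sleep()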
Here is the rewards vs. episodes plot for 5000 episodes; you can see the smoothed value increasing slowly, which means it's learning.

# Why is it learning so slowly?
1. I used alpha = 0.5. It might be faster with a larger value.
2. The Gazebo simulation environment is much more complicated than the original OpenAI Gym one, and it also takes sensor noise into account.

# Things that are really amazing...
1. It learns to move in the opposite direction when the pole falls by itself (I use the same reward for both directions).
2. It learns how to "save" the pole from falling even in critical situations.

NightmareNadir