Reinforcement Learning - My Algorithm vs State of the Art

In this third video about inverted pendulum balancing with AI, I compare the results of my own algorithm against a state-of-the-art reinforcement learning algorithm using Isaac Lab, which is part of the NVIDIA Omniverse platform.

PC Specs:
i7-12700K
RTX 4090
32GB memory
Comments

I think you would be interested in network pruning. This is something that's typically done periodically during training to thin out networks. If you examine the weights in your PPO-optimized network, you'll find that many are very small, while others are larger. If the near-zero weights are set to zero, networks will often become more stable after fine-tuning. You'll find that the connections in the network begin to look sparse and very similar to networks generated via evolutionary methods. PPO is just an optimizer and will work with whatever network configuration you want. The evolutionary networks shown in the video are all differentiable, so PPO would be able to optimize them. That would be an interesting comparison if you'd want to pursue it!
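
A minimal sketch of the magnitude-based pruning described above, assuming a PyTorch policy network (the layer sizes and the 30% pruning amount are illustrative placeholders; `torch.nn.utils.prune` is PyTorch's built-in pruning module):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical stand-in for a PPO-optimized policy network.
policy = nn.Sequential(
    nn.Linear(4, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

# Zero out the 30% smallest-magnitude weights in each linear layer;
# the network would then normally be fine-tuned for a while.
for module in policy.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the mask into the weights

# Inspect sparsity: many connections are now exactly zero, similar to
# the sparse topologies produced by evolutionary methods.
linears = [m for m in policy.modules() if isinstance(m, nn.Linear)]
total = sum(m.weight.numel() for m in linears)
zeros = sum(int((m.weight == 0).sum()) for m in linears)
print(f"sparsity: {zeros / total:.0%}")
```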

chris-graham

Thank you so much for this demonstration and for adding the links! I didn't know about Isaac Lab and was wondering how it was possible to control the mechanics. Great video!

fluffsquirrel

The quality and educational value of these videos are unmatched. Please keep making stuff like this!

imanuelbaca

Getting that sort of aid from NVIDIA is super nice. Super cool. My school just got an AI accelerator, the AGX Orin, a very cool piece of computing and fantastic for AI training and research. Also, as someone who is more hardware oriented, it has a super fascinating architecture (shared CPU and GPU global memory!).

Waffle_

I've been looking for a subject for my engineering degree and this video might be exactly it! Thank you for the inspiration, your videos are always a blast!

kubstoff

This is just brilliant. I audibly gasped at those numbers. I am so grateful to be living in a world with this sort of stuff; it's truly amazing!

briandeanullery

Have you considered adding physical parameters such as motor torque and motor weight? This would give you a much more realistic sim and difficulty level, along with realistic response times (based on inference speed + connection latency). Also, you could either have one motor at the base and one at the middle joint, or both at the base.

You might also consider adding a battery's weight, sized to power those two motors for some period (say 5 minutes). This would be an awesome challenge and would connect the simulation to reality much more closely, which sounds super exciting. Looking forward to seeing if you end up working on it!
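
One way these constraints could be sketched, independent of any particular simulator (all numbers below are illustrative placeholders, not real motor specs):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class MotorModel:
    """Toy actuator model with a torque limit and a fixed control latency."""
    max_torque: float = 2.0   # N*m, saturation limit (placeholder value)
    latency_steps: int = 3    # inference + connection delay, in sim steps

    def __post_init__(self):
        # Buffer of pending commands, modeling the response delay.
        self._queue = deque([0.0] * self.latency_steps)

    def step(self, commanded_torque: float) -> float:
        """Delay the command by `latency_steps` and clamp it to the limit."""
        self._queue.append(commanded_torque)
        delayed = self._queue.popleft()
        return max(-self.max_torque, min(self.max_torque, delayed))

# The policy's output would be routed through this model each sim step,
# so the agent has to learn to cope with saturation and delay.
motor = MotorModel()
for command in (0.5, 3.0, -1.0, 0.0):
    print(motor.step(command))
```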

hcy

It's amazing to see it temporarily give up on balancing when it gets too close to the edge of the rail, so it can try again later from a more favorable position.

waity

We got the Pezzza's Work x NVIDIA collab before GTA VI 😭

max_me_is

17:06 That's so similar to the timescales of evolution in nature and of a human learning a skill. That's kinda crazy. It really makes it look like the algorithms successfully mimic their real-world counterparts.

_nemo

PPO, and gradient-based policy learning in general, is amazing. I will still say that your struggle to get an evolutionary algorithm to learn this problem led to some really creative and impressive curriculum learning ideas, which also apply to PPO :)

poketopa

Do note that evolutionary algorithms are usually better than pure RL agents for problems with very sparse rewards (which is not the case here). For those problems, a hybrid approach might work best.

sutsuj

Oh my god, a video from Pezzza!! I'm so excited!!

PatrickHoodDaniel

Thanks for this high-quality video and comparison of those algorithms, very nice. Keep it up!

requestfx

We've come a long way in simulation technology.

MicahBratt

I'm solving a similar task: I'm trying to teach an AI car to drive with realistic physics. I was struggling with training just like you were in the previous video, so, inspired by your solution, I tried another approach: I started from simple physics (no inertia, no wheels, just rotations + offsets), then gradually interpolated between this simple physics and the hard physics. My NN was able to learn how to drive perfectly. But then I tried an energy-based model: basically, it's an NN that receives the current state and a candidate action, and outputs just a single number, the energy. You then need to find the action that yields the minimum energy. I iterated over 9 possible actions, and that NN was able to learn how to drive with the complex physics without any hacks, and very fast.
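
A minimal sketch of the energy-based action selection described above, assuming a PyTorch model; the state/action dimensions and the 3x3 action grid are made-up placeholders:

```python
import torch
import torch.nn as nn

# Hypothetical energy network: maps (state, action) -> a single scalar.
class EnergyNet(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

# The 9 candidate actions the commenter mentions, here imagined as a
# 3x3 grid of (steering, throttle) combinations.
actions = torch.tensor(
    [[s, t] for s in (-1.0, 0.0, 1.0) for t in (-1.0, 0.0, 1.0)]
)

def select_action(model: EnergyNet, state: torch.Tensor) -> torch.Tensor:
    """Evaluate every candidate action and return the one with minimum energy."""
    with torch.no_grad():
        batch = state.expand(len(actions), -1)        # repeat state per action
        energies = model(batch, actions).squeeze(-1)  # one scalar per action
    return actions[energies.argmin()]

model = EnergyNet(state_dim=6, action_dim=2)
print(select_action(model, torch.zeros(1, 6)))
```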

So, here is what I think: first try CMA-ES, as a superior zero-order optimization method. I think NEAT is trash, and one day I will test that myself. Then you should try an energy-based model. Then it would be a somewhat fair comparison; right now it's not fair at all, and I'm slightly disappointed with this video.
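
For reference, a minimal sketch of running CMA-ES over flat network weights with the pycma package (`pip install cma`); the parameter count and the cost function are dummy placeholders:

```python
import numpy as np
import cma  # pycma package

# Placeholder: in a real setup this would unpack the weight vector into
# the policy network, roll out an episode in the simulator, and return
# the negative episode reward (CMA-ES minimizes).
def episode_cost(weights: np.ndarray) -> float:
    return float(np.sum(weights ** 2))  # dummy stand-in objective

n_params = 100  # total number of network weights (placeholder)
es = cma.CMAEvolutionStrategy(n_params * [0.0], 0.5)  # mean 0, sigma 0.5
while not es.stop():
    candidates = es.ask()  # sample a population of weight vectors
    es.tell(candidates, [episode_cost(x) for x in candidates])
    es.disp()
best_weights = es.result.xbest
```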

optozorax

Your channel is growing awesomely and it's well deserved. I love your work, thank you.
Very nice that NVIDIA sponsored this video too!

VivienLEGER

Now you have to add flex to the materials and a small gap between the rollers and the beam. Then add some slack in the bearings...

nexttonic

Another great video! I'll try Isaac Sim a bit too... it seems pretty cool to play with.

jairoandre-

Instant like and sub. I could watch these all day. Great work!

kiaranr