The Full Reinforcement Learning Iceberg

Dive into 10 levels of the RL stack with Joseph Suarez, a newly minted MIT PhD and the creator of Neural MMO + PufferLib. There's something here for beginners and world-class experts alike. Star the project on GitHub to feed the puffer!

Most of my development is livestreamed right here. It's all open source, and we welcome contributions!
Comments

Thanks for putting in the work to build a solid foundation for the future researchers. I hope you become a standard and get rewarded for your contributions

DavidMisc-stuff

wow this was actually fantastic! very well explained the landscape. even has procgen! incredible. Going to check this lib out - thank you

AxelAhmer

Great video! Love learning abt RL. Subscribed :)

sinfinite

incredible video as always, i've been putting off starting my RL journey and i think that thanks to this video i'm starting lol

snats_xyz

Nice video! I also checked out your article on twitter, was a bit hard to find so you should also link it in the description.

mgostIH

Thanks for the video dude! Keep it up!

krankvegann

This is a very digestible summary of what you've been working on. I generally operate in the generative NLP space, whose intersection with reinforcement learning tends to be a quick REINFORCE or PPO run to adapt to human preferences, but this makes me wonder if it might not be a bad idea to take a few more steps into RL.

With regard to open-endedness (creativity, if you will), it seems to me quite a natural assumption that some integration with language models with a combination of explicit and implicit planning, as well as specific rules regarding the communication of agents, would be the way forward (I do regard the segregation of information between agents to be quite an important consideration), though I dare not suggest exactly what form that will take for fear of being just close enough to be frustrated at not guessing at it, and yet being just far enough to be laughed at for it.

novantha

You’re so cool bro it’s actually incredible and you are working out my dream in real time

Sykooma

It is hard to gain trust as a dev when you are wearing elegant tuxedos instead of a coffee-stained white t-shirt! Jokes aside, great video!

avoidthevoid

I'm glad I'm not the only one building custom simulators

AIShipped

But my Dr. says all my issues are because of carbs, I am confused

Wicaeed

Amazing video, people underestimate these points. I am here for level 10 though ;)

umairnasir

this is awesome, how approachable do you think PufferLib is for beginners in RL? For context, I've trained a few RL agents using Gym in the past.

kushaagra

Great video! Some finance env please ;)

philiplivdan

cool video, "ppo solves dota, it can probably solve your problem too" is pithy, I like it

AmbisinisterSSBM

regarding open-endedness... Minecraft, Web-Agents, ...?

-mwolf