Machine Learning Summit: Successfully Use Deep Reinforcement Learning in Testing and NPC Development

In this 2020 GDC Virtual Talk, Unity's Jeffrey Shih explains how deep reinforcement learning and imitation learning can be used to scale playtesting and NPC creation in games.

GDC talks cover a range of developmental topics including game design, programming, audio, visual arts, business management, production, online games, and much more. We post a fresh GDC video every day. Subscribe to the channel to stay on top of regular updates, and check out GDC Vault for thousands of more in-depth talks from our archives.
Comments

Good talk. That stingy boss is cool. I would really love to see some tutorials on how to do these kinds of crazy things.

Hadrhune

"Source of Madness" looks like an interesting game, I'll check it out :)

freemind.d

I'm glad this talk addresses the issue of computation cost, which is basically the main reason it's difficult to put DRL into practice.
I guess you could use something like SAC so that you don't have to throw away your trajectories every epoch, but then again, the implementation might be a bit difficult: most RL baselines are tested with PPO, and implementing SAC requires some understanding of maximum-entropy RL and the tradeoff it makes against the entropy of the action distribution emitted by the policy, plus a lot of training time.
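The "don't throw away your trajectories" point boils down to off-policy replay: SAC-style learners keep old transitions in a buffer and resample them across many gradient steps, whereas on-policy PPO discards data after each update. A minimal sketch of that mechanism (class and parameter names are illustrative, not from any particular library):

```python
import random
from collections import deque

class ReplayBuffer:
    """Toy off-policy replay buffer: transitions are stored once and can be
    sampled many times across updates, unlike on-policy rollouts."""

    def __init__(self, capacity):
        # Oldest transitions are evicted first once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling with replacement across calls: the same
        # transition may feed many gradient steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
for t in range(100):
    buf.add(state=t, action=0, reward=1.0, next_state=t + 1, done=False)

batch = buf.sample(32)  # drawn from stored experience, not a fresh rollout
```

In a real SAC setup the sampled batch would feed the critic and actor losses; the point here is only that stored experience is reused rather than discarded.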

But even if you can overcome the cost via imitation learning: 1. how are you going to get the large amount of sample trajectories in the first place, and 2. how do you overcome the unpredictability of the policy in situations or environments unseen during training? I've heard that a lot of control engineers prefer analytical control like LQR for this reason, and unpredictability in your NPC AI would just look like "bad AI" to average gamers (case study: Total War: Rome II, which, if I recall correctly, used a form of Q-learning?). Gamers trying to exploit game mechanics would probably rather see a predictable failure in the AI than something unpredictable that ends with the agent aimlessly wandering around, which is definitely one of the things that would happen in unexpected situations.

johnlime

Towards the end of each sentence it's like the guy runs out of breath, so you can't hear much.

BossKillRatio

Also worked on this topic; it's a pretty dry and inefficient approach.

teckyify

Very boring, no real-world examples of success.

jonabirdd