Multi-Agent Proximal Policy Optimization

Two artificially intelligent agents control rackets to play tennis. Each agent uses a Gaussian actor-critic network and was trained with Multi-Agent Proximal Policy Optimization. The critic's input was augmented with the observations and actions of all agents, each taken from that agent's own point of view.
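As a rough illustration of the augmented critic input, here is a minimal sketch in plain Python. The function name `critic_input`, the feature ordering (the agent's own data first, then the other agents'), and the example sizes are all assumptions made for illustration; the description does not specify how the features were actually laid out.

```python
from typing import List, Sequence


def critic_input(
    obs: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
    agent: int,
) -> List[float]:
    """Build a centralized critic input for one agent (hypothetical helper).

    obs[i] and actions[i] are the observation and action of agent i.
    Assumed ordering: the chosen agent's data first, then the others',
    so each critic sees the joint state from its own point of view.
    """
    order = [agent] + [i for i in range(len(obs)) if i != agent]
    features: List[float] = []
    for i in order:  # all observations, own first
        features.extend(obs[i])
    for i in order:  # all actions, own first
        features.extend(actions[i])
    return features


# Made-up sizes: 2 agents, 3 observation values, 2 action values each.
obs = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
actions = [[0.1, 0.2], [0.3, 0.4]]

x0 = critic_input(obs, actions, agent=0)
x1 = critic_input(obs, actions, agent=1)
```

Both vectors contain the same joint information, but each starts with its own agent's observation and action. Under this assumed layout, a single critic network could be shared between the two players.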