Multi Agent Proximal Policy Optimization
Two artificially intelligent agents control rackets to play tennis. Each agent uses a Gaussian actor-critic network and was trained with Multi-Agent Proximal Policy Optimization. The critic input was augmented with the observations and actions of all agents, each taken from that agent's own point of view.
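As a rough illustration of the augmented critic input described above, here is a minimal sketch in plain Python. The function name `critic_input`, the ordering (own observation and action first, then the other agents), and the vector sizes are assumptions made for the example; the description does not say how the actual implementation lays out the input.

```python
from typing import List, Sequence


def critic_input(agent_idx: int,
                 observations: Sequence[Sequence[float]],
                 actions: Sequence[Sequence[float]]) -> List[float]:
    """Build the augmented critic input for one agent.

    Assumed layout (hypothetical): the agent's own observation and action
    come first, followed by those of the other agents in index order.
    """
    n_agents = len(observations)
    if len(actions) != n_agents:
        raise ValueError("need exactly one action per agent")
    # Order agents from this agent's point of view: self first, then the rest.
    order = [agent_idx] + [i for i in range(n_agents) if i != agent_idx]
    features: List[float] = []
    for i in order:
        features.extend(observations[i])
        features.extend(actions[i])
    return features


# Two agents with made-up 3-dimensional observations and 2-dimensional actions.
obs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
act = [[1.0, -1.0], [0.5, 0.0]]
x0 = critic_input(0, obs, act)  # input to the critic when evaluating agent 0
x1 = critic_input(1, obs, act)  # same data, reordered for agent 1
```

Both vectors contain the same joint information, so the critic can account for what the opponent saw and did, while each actor still acts only on its own observation.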