Creating a custom Unity3D Machine Learning Agent


-------

Go through the steps required to create a new custom machine learning agent in Unity3D. I'll show you how to set it up for training, how to do the training with TensorFlow, and how to get that trained agent up and running in-game.
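To give a feel for where those steps end up, here is a minimal sketch of a custom agent script using the ML-Agents API names mentioned in the comments below (CollectObservations / AddVectorObs / AgentAction). The class name, fields, movement logic, and reward values are illustrative placeholders rather than the exact code from the video; the reward calls (SetReward / Done) follow the newer API, while older versions used reward/done fields, and depending on your ML-Agents version you may also need a using directive for its namespace:

using UnityEngine;

// Minimal illustrative agent: a cube that learns to move towards a target.
public class MoveToTargetAgent : Agent
{
    public Transform target;        // assigned in the inspector
    public float moveSpeed = 0.1f;

    public override void AgentReset ()
    {
        // Start each training episode from a known state.
        transform.position = Vector3.zero;
    }

    public override void CollectObservations ()
    {
        // Everything the brain is allowed to "see" each step.
        AddVectorObs (transform.position);
        AddVectorObs (target.position);
    }

    public override void AgentAction (float[] vectorAction, string textAction)
    {
        // Apply the brain's chosen action as a movement on the X/Z plane.
        transform.position += new Vector3 (vectorAction[0], 0f, vectorAction[1]) * moveSpeed;

        // Reward reaching the target and end the episode.
        if (Vector3.Distance (transform.position, target.position) < 1.5f)
        {
            SetReward (1f);
            Done ();
        }
    }
}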

Comments

So this was a great video which got me started with a simple example.
Some tips to save other people 2.5 days of head scratching ;-)
- Use Unity 2017.
- The API has changed for a few of the calls; updated examples below:

// Observations are now added with AddVectorObs inside CollectObservations,
// instead of building and returning a List<float> of state values.
public override void CollectObservations ()
{
    // old version:
    // List<float> state = new List<float>();
    // state.Add (currentNumber);
    // state.Add (targetNumber);
    // return state;
    AddVectorObs (currentNumber);
    AddVectorObs (targetNumber);
}

// The action callback now has this signature.
public override void AgentAction (float[] vectorAction, string textAction)
{
    // agent behaviour code goes here
}

- You need TensorFlow 1.4 for correct builds in Unity, not 1.8.

Outside of that, great tut, thanks!

androvisuals

Thanks for the video! A couple of notes, though.

First, the reason for the high immediate reported success rate (rather than a slow climb) is that the only way for the agent to fail is to go too far in the wrong direction. The cube could take tens of thousands of steps to reach the target, and as long as it gets there eventually without going out of bounds, that's a success. Not a big problem for a simple example like this, but in a more complex case it could result in very long training sessions where each episode takes a potentially unbounded amount of time to complete. To limit the amount of time per iteration of training and get more meaningful feedback, enter a value into the "Max Step" field on the agent (300 seems to work well in this case, as the minimum it needs to complete this scenario every time is 100). Leaving a zero in this field gives the agent infinite time to complete the task.

Second, the reason for the jitter is that the cube has no reason *not* to jitter. Completing the task in 10,000 steps yields the same reward as completing the task in 100 steps. The simplest way to remove the jitter is to add a step penalty so that moving directly towards the target yields a greater reward than wasting time going the wrong way: when you check whether the agent has reached its target, add an "else reward = -.01f" (see the sketch below).

The danger with this modification is that if the agent were to start close to a boundary, it might end up learning to move directly out of bounds and lose immediately rather than risk losing AND spending time trying to win. That won't be an issue in this case because the target is always closer than the boundary; it's just something to be aware of if one wanted to extend this example.
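For anyone following along, here's a rough sketch of how that reward check could look inside the agent's action callback. It uses the reward/done fields from the ML-Agents version shown in the video (on newer versions you'd call AddReward/SetReward and Done() instead), and the target reference, distance threshold, and the IsOutOfBounds helper are illustrative assumptions rather than the exact code from the video:

// Inside the agent's action callback, after the movement has been applied.
// 'target' is the Transform the cube is trying to reach (assumed setup).
float distanceToTarget = Vector3.Distance (transform.position, target.position);

if (distanceToTarget < 1.5f)
{
    reward = 1f;       // reached the target: full reward
    done = true;       // end the episode
}
else if (IsOutOfBounds ())  // hypothetical helper wrapping the existing fail check
{
    reward = -1f;
    done = true;
}
else
{
    reward = -0.01f;   // small per-step penalty so wandering and jitter cost reward
}

Max Step itself isn't set in code; as described above, it's the "Max Step" field on the Agent component in the inspector (300 works well here).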

willpetillo

Been waiting for tutorials since the first day Unity released that ML video :)
Many thanks.

nathanhung

I'm learning this now and this was exactly what I needed! Very helpful.

complexobjects

Please make more! Thanks to this series of videos, I'm now trying to make a brain that can shoot (catapult) an object at a target: first just distance (2D), and then I'll try 360° around the shooter's position!

Raducki

You saved my life, this tutorial is so useful!

danlan

Great job with this tutorial, very helpful. Keep going!

oakisland

Thanks man, I can't wait for the advanced tutorial.

Mohammad-kxrk

Will you be posting more advanced tutorials soon? Thanks for your efforts.

mavisakal

Is it possible to use this system with data from players playing the game with their own characters, using that data as the agent's training sessions, so the AI controller learns from players' play styles? And would it produce a near-perfect AI, like precisely hand-coded, logic-based behaviour written by a programmer, or would it be more eccentric because it was trained by many people from around the world?

boroborable

Thanks! BTW, Brain.DefaultAction has changed to 0, so if you're following this tutorial you may want to set it to -1.

abclef

Very nice tutorial, really appreciated! However, I haven't managed to find out what all those hyperparameters do... yet.
Did you find a good hyperparameter configuration to increase the Mean Reward?
Keep up the good work, I'll be waiting for your next tutorial! :)
Best

nicolagarau

Hello, I watched the video, thank you. It has helped a lot. I have one question: I want to train after an Android build; what should I do?

장상욱-cq

@Unity3d College I've been working on a custom learning agent, but it seems as though at one point the learning starts to regress and the agent "forgets" how to get a positive score. Have you (or anyone else reading this) run into this issue?

NofarStudio

PPO.ipynb is no longer on GitHub. Unfortunately, none of these ML tutorials are usable.

bradcarvey

Good job, keep it up!!! #notificationsquad

AceLikesGhosts

Is there a link to download the Jupyter notebook you wrote?

danlan

Good, learning, nice.
BUT the TensorFlow DLL increases the build size by 42 MB on compile...

HelyRojas

Hello,
I am getting this error:
"TFException: NodeDef mentions attr 'output_dtype' not in Op<name=Multinomial; signature=logits:T, num_samples:int32 -> output:int64;" I am using TensorFlow 1.5.0. Any ideas?

kristijonaszykas

Hey There!

Unfortunately, when I run Jupyter to load the environment, I receive this error message:
UnityEnvironmentException: The Unity environment took too long to respond. Make sure numberdemo does not need user interaction to launch and that the Academy and the external Brain(s) are attached to objects in the Scene

Can you help me out?

tranduy