Watch this A.I. learn to fly like Ironman

...So reinforcement learning is kinda like telling the neural network: "look, I don’t know how to do the thing, but you try to do the thing, and if you succeed I’ll give you a reward of 5 dollars." So basically like a father who failed at life and pushes his kid way too hard in an attempt to live out his dreams through his child… That got depressing.

Some music by @LAKEYINSPIRED
Comments
Author

Seems like adding a facing reward would help stabilize the rotation.

manselreed
Author

you should also have a negative reward for high angular velocities, that way it has a reason to be more still

redpug
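
The two suggestions above (a facing reward and a penalty on high angular velocity) can be sketched as extra terms in a per-step reward. This is only an illustration and not the video author's reward function: the function name, the weights, and the progress term are all assumptions.

```python
import math

def shaped_reward(dist_prev, dist_now, forward, to_target, angular_velocity,
                  facing_weight=0.1, spin_weight=0.05):
    """Hypothetical per-step reward: progress + facing bonus - spin penalty."""
    # Progress: positive when the agent moved closer to the target this step.
    progress = dist_prev - dist_now

    # Facing: cosine of the angle between the agent's forward vector
    # and the direction to the target (1 = facing it, -1 = facing away).
    dot = sum(f * t for f, t in zip(forward, to_target))
    norm = (math.sqrt(sum(f * f for f in forward))
            * math.sqrt(sum(t * t for t in to_target)))
    facing = dot / norm if norm > 0 else 0.0

    # Spin penalty: grows with the magnitude of the angular velocity.
    spin = math.sqrt(sum(w * w for w in angular_velocity))

    return progress + facing_weight * facing - spin_weight * spin
```

With these assumed weights, an agent that makes the same progress while spinning earns less than one that stays still, which is the incentive both comments are after. The weights would need tuning so the penalties do not swamp the progress term.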
Author

Theory on why it flies so slow:
Its original training was based on hovering around one point, so when it gets a new destination it still assumes it should arrive there without momentum, to better stay at that spot.
Then it got a little training with randomly moving spots. Having momentum there is bad too, since it's actually far more probable that you need to turn around than that you need to keep going.
This, along with little time-based punishment, results in a slower ar

jaceg
Author

give it access to the next 2 points so it can find a vector between them, also give it incentive to be faster

danieltoomey
Author

To combat the agent being slow and rotating, you could add two other negative rewards. Every full rotation could deduct points, which would likely reduce the spinning to a minimum. Then also give it, say, 30 seconds to complete a course, but deduct points for each second spent too; the agent might learn that the quicker it goes, the fewer points it gets deducted. I think revisiting this with these two additional criteria would be pretty interesting.

p.
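
A rough sketch of the two penalties described above, assuming the simulation can report a yaw rate each step. The class name, the default penalty sizes, and the choice to count accumulated absolute yaw as "rotation" are all assumptions made for illustration; only the 30-second limit comes from the comment.

```python
import math

class EpisodePenalties:
    """Hypothetical tracker for per-second and per-full-rotation penalties."""

    def __init__(self, time_limit=30.0, per_second=0.1, per_rotation=1.0):
        self.time_limit = time_limit
        self.per_second = per_second
        self.per_rotation = per_rotation
        self.elapsed = 0.0
        self.turned = 0.0           # accumulated absolute yaw, in radians
        self.rotations_charged = 0  # full rotations already penalized

    def step(self, dt, yaw_rate):
        """Return (penalty, timed_out) for one simulation step of length dt."""
        self.elapsed += dt
        self.turned += abs(yaw_rate) * dt

        # Time penalty: every second spent costs a little.
        penalty = self.per_second * dt

        # Rotation penalty: charge once for each newly completed full turn.
        full = int(self.turned // (2 * math.pi))
        penalty += self.per_rotation * (full - self.rotations_charged)
        self.rotations_charged = full

        return penalty, self.elapsed >= self.time_limit
```

The returned penalty would be subtracted from the step reward, and `timed_out` would end the episode. Because this version accumulates absolute yaw, wobbling back and forth also adds up to a "rotation" eventually; tracking signed yaw instead would only punish sustained spinning in one direction.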
Author

Can we just admire how a few years ago AI struggled to play a 2D game and now this. It's really remarkable.

flyinggoatman
Author

I think part of the reason it has such a hard time is because it doesn't quite have the detailed control vectors that Iron Man does. If you watch the hovering and flight scenes in the first movie, you'll see he has little compressed air nozzles, jet redirectors, and control surfaces on the boots to help stabilize. He also obviously has flaps on his back, and in later iterations of the armor he has backpack-style thrusters so his COG can be below the thrust point. If the game simulates air drag then add the flaps and stuff, too, but the minimum I think you need to add are the micro thrusters, back jets, and elbow/knee joints.

Drunken_Hamster
Author

This is my first time viewing your work, and I'm struck both by how incredibly cool this is and your f'ing hilarious sense of humor.
I'm always the last to know, I guess. Really fantastic work, fam.

grimcity
Author

You should’ve added more or fewer points depending on how much time they take to get to the target; that’s what would fix the flight.

michaeln
Author

Your reward function could be modified to get what you want. Add in a score for time and a penalty for excessive rotations/spinning.

MrAmalasan
Author

I'd love to see you tackle AI in a preexisting game. I dunno, throw half life at it and see what sticks.

bumpybumpybumpybumpy
Author

it must have control over the propulsion force

SUED
Author

Another banger. Always love the way you use memes to make it funny!

reendevelops
Author

Another thing that you could add to this would be random perturbations, like throwing blocks at the agents so that they learn to recover from instability like the drone had at the end. Would you be willing to release the source files for the project and then do a compilation of different people's attempts at improving the result? I think learning the actual Iron Man style of flying might be possible, but if you don't want to do all the work on that it could be fun to see what the community comes up with.

CharthuliusWheezer
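
One simple way to approximate the random perturbations suggested above is to occasionally kick the agent's velocity during training instead of simulating thrown blocks. That substitution, the function name, and the default probability and impulse size are assumptions for this sketch, and it treats the agent as having unit mass so an impulse equals a velocity change.

```python
import random

def maybe_perturb(velocity, rng, probability=0.02, max_impulse=3.0):
    """With a small probability, add a random kick to each velocity component.

    Hypothetical helper: assumes unit mass, so impulse == change in velocity.
    """
    if rng.random() >= probability:
        return tuple(velocity)  # most steps: leave the agent alone
    return tuple(v + rng.uniform(-max_impulse, max_impulse) for v in velocity)

# Usage: call once per training step with a seeded generator.
rng = random.Random(0)
velocity = maybe_perturb((0.0, 1.0, 0.0), rng)
```

Passing the generator in explicitly keeps training runs reproducible, and setting `probability` to zero turns the disturbances off for evaluation.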
Author

seems like the rotation is so that it can use centrifugal force as a stabilization method

fodderfella
Author

glad your brain cells and hairs grew back :)

raeraeraeraeraerae
Author

I would make the reward relative to the forward direction to each node to promote a flying posture and stop the spinning. If you added the next node as input as well it might be a bit better at handling its own momentum out of each node.

ArtamisBot
Author

bro did this without even activating windows what a legend

NotKotten
Author

LOL every time I watch your videos I laugh at the editing. Excellent.

McShavey
Author

Let it know where the next goal point is going to be after the one it's currently at disappears and add a reward for getting to the next goal faster.
That way it'll learn to keep the momentum between goals instead of learning to slow down before hitting goals so it doesn't overshoot them and get punished.

Flippin_Tables_Like_Jesus
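
Several comments in this thread suggest showing the agent the next goal as well as the current one, and paying more for faster arrivals. A minimal sketch of both ideas follows. The function names, the observation layout, and the bonus formula are hypothetical; the actual inputs and rewards used in the video are not known from this page.

```python
def build_observation(position, velocity, goals, index):
    """Hypothetical observation: velocity, offsets to the current and
    next goal, and the direction of the leg between those two goals."""
    current = goals[index]
    # On the last goal there is no successor, so repeat the current goal.
    nxt = goals[min(index + 1, len(goals) - 1)]

    to_current = [c - p for c, p in zip(current, position)]
    to_next = [n - p for n, p in zip(nxt, position)]
    leg = [n - c for n, c in zip(nxt, current)]  # vector between the two goals

    return list(velocity) + to_current + to_next + leg

def arrival_reward(elapsed, time_budget=10.0, base=1.0, speed_bonus=1.0):
    """Hypothetical goal reward: a fixed base plus a bonus that shrinks
    linearly to zero as the time taken approaches the budget."""
    return base + speed_bonus * max(0.0, 1.0 - elapsed / time_budget)
```

Knowing where the next goal lies gives the policy a reason to carry momentum through the current one, and the shrinking bonus rewards doing so quickly rather than braking to a stop at every goal.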