Simulating the World To Train AI

Recently, Nvidia announced a synthetic data engine for training artificial intelligence.

Synthetic, meaning that researchers can use it to generate fake images of the real world for training self-driving AIs.

The concept of leveraging modern open-world video games and their engines to create huge amounts of fake data for your AI might seem like a disaster in the making.

But recent events seem to imply otherwise. Using computers to generate data so that we can train other computers is a trend that is working.

In this video, we are going to talk about training autonomous driving AIs with data from a synthetic world.

Comments

The research published by NVIDIA is absolutely crazy. I remember when they developed a GAN that could generate photorealistic people who don't actually exist. Crazy stuff. This is just further proof that whatever they're doing there is quite insane.

David.Marquez

Hey, long-time fan of your channel. You touched on my field of interest, so I thought I'd leave some notes.
Notes from an AI master's student:
1:22 You mention semantic labeling, but don't actually show any semantic segmentation, which is the semantic labeling of each pixel. It is slightly different from instance segmentation (on the right), because with semantic segmentation you need to assign a value to every pixel, not just to the objects within the image. The most complete form is panoptic segmentation, in which both instances and classes are annotated at every pixel.
1:47 A brand-new dataset is not actually the standard; what is commonly tested is generalization from one subset of a dataset (~80%) to a held-out subset (~20%). Evaluating on a brand-new dataset falls under out-of-distribution (OOD) shift and the general term "robustness".
3:22 AlphaZero is a warped example of the concept you're trying to explain, because AlphaZero has a perfectly working simulation to work with, and its predecessor also did a lot of self-play. The improvement was in removing the need to pretrain (or prime) the model on human play. This was an advancement in reinforcement learning, not in the simulation-to-reality gap.
4:59 You miss the full scope of the difference in scale. The difference between 3 and 96 doesn't seem large (a factor of 32), but what is actually significant is the number of parameters: roughly 10,000 parameters for ALVINN versus 170,000,000,000 (170 billion) for GPT-3, a factor of 17 million. Also check out DALL-E for the new multimodal language-and-image comprehension and Gato for multimodal behavior learning.
5:20 A KMP 1500 robot isn't deployed with any machine-learning behavior; it's all old-fashioned expert-made modelling. A better illustration would be robot arms grabbing complex and/or deformable objects.
6:11 Actually, the problem isn't that 5,000 isn't enough; it's an academic benchmark dataset. Big companies can get 500,000 images if they try a bit (probably even more). The problem is that that's still not enough for self-driving cars. And it's not that 500,000 images aren't enough to detect trees or pedestrians in the city on a clear day; you need an insane amount of diversity to detect trees and roads under any weather condition and circumstance. Not to mention one-in-a-billion or one-in-a-trillion edge cases.
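The 80/20 protocol mentioned in the note at 1:47 can be sketched in a few lines (a toy illustration only; the dataset of index placeholders and the seed are made up):

```python
import random

# Hypothetical toy dataset: 1000 labeled images, represented here by indices.
dataset = list(range(1000))

# The common benchmark protocol: shuffle once, then hold out ~20% of the
# *same* dataset for testing, rather than evaluating on a brand-new dataset
# (which would instead measure out-of-distribution robustness).
random.seed(42)
random.shuffle(dataset)

split = int(0.8 * len(dataset))
train_set, test_set = dataset[:split], dataset[split:]

print(len(train_set), len(test_set))  # 800 200
```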

For the rest, the CARLA stuff is great!

PS: for another mind-blow, look up "Neural Radiance Fields" and "Instant NeRFs".

hidde

Is it really an accurate simulation if BMW drivers use their turn signals in them?

izzieb

The obvious problem seems to be overfitting on a massive amount of data that is inevitably biased (or outright flawed) compared to real-world information, because the "space of possibility" represented by the dataset (no matter how large it is) covers at most a simplified interpretation of the world as envisioned by the particular developers of the simulation. Go doesn't have this problem because its "space of possibility" is already exhaustively defined by the game's rules.
It might be useful in some cases, but I remain skeptical of its use for AI meant to operate in a potentially lethal environment like car traffic (or cancer detection, for that matter).
What if the AI turns out significantly less successful at detecting, say, a pedestrian who isn't wearing (something similar to) one of the 12 sets of clothing implemented in the simulation?

buzhichun

According to some theories, this is exactly what we do when we dream, minus the "emotions" part: our subconscious mind generates various scenarios, with some elements from reality, to challenge other parts of the mind on how to respond to them.

claudiopiccoliromera

Two problems jumped out. First, substituting an artificially created world for the real thing runs the very real risk of leaving some "events" out of the training dataset, which creates a blind spot in the trained model. Second, synthetic data may create "events" that don't exist in the real world; this wastes training cycles and bloats the trained intelligent agent, since it's trained for things that don't really happen. Great video!

lakeguy

I know they used only the images, but training a self-driving AI with GTA is glorious. What could go wro... WASTED!

Monsterpala

So some AI are literally learning their driving skills from GTA V...

elibar

Perfectly fine. We typically expand our dataset via bootstrapping, and this simulation is its equivalent.

sweealamak

There are a lot of approaches to this kind of synthetic simulation, and you covered some of them in your video, but there are other ways of generating synthetic data that are missing here. Waymo's internal simulation engine takes a somewhat different approach: they try to synthesize new sensor returns directly from real-world sensor data. Instead of using a video game engine, they might use a neural network or a mesh reconstructor to directly interpolate or re-raycast sensor returns. Given that almost all autonomous-car companies have reams and reams of sensor data, this can be a more scalable way to produce photorealistic environments than hand-authored or even procedurally generated 3D assets.
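The "interpolate sensor returns directly" idea described above can be illustrated with a toy sketch: given two logged range scans (say, lidar sweeps a short time apart), synthesize a plausible in-between scan by blending them, with no 3D assets involved. Real systems are far more sophisticated; the function name and the data here are made up.

```python
def interpolate_scans(scan_a, scan_b, t):
    """Blend two same-length range scans; t=0 gives scan_a, t=1 gives scan_b."""
    assert len(scan_a) == len(scan_b)
    return [(1 - t) * a + t * b for a, b in zip(scan_a, scan_b)]

logged_t0 = [10.0, 9.5, 9.0, 8.5]   # ranges (meters) logged at time 0
logged_t1 = [9.0, 8.5, 8.0, 7.5]    # ranges at time 1 (car moved forward)

# A synthetic scan "halfway" between the two real ones.
synthetic = interpolate_scans(logged_t0, logged_t1, t=0.5)
print(synthetic)  # [9.5, 9.0, 8.5, 8.0]
```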

Another tricky aspect that I think is less covered in this video is how to generate realistic agent reactions within these worlds, whether hand-authored or synthesized. One approach is to write what are basically video-game AIs for the other agents. It works, but has drawbacks like complexity (at some point you're literally just writing another self-driving car) and validation (unlike video-game AI, you'd want some guarantees on realism of behavior, which is hard to measure). Another way is to train a second NN to generate these outcomes, whether a GAN or some other net. You don't need to worry about writing a second codebase, but you inherit all the problems of black-box NNs, doubly problematic here because this is what you're using to train and validate your main software.

ericbday

I'd think that the biggest benefit of synthetic training environments is that the software can automatically label absolutely *everything*, so there's no need for humans to laboriously label scene elements. A *huge* reduction of labor at the same time as a vast increase in available footage! 👍😁
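The point above about free labels can be sketched concretely: a simulator already knows the class of every object it draws, so it can emit a pixel-perfect ground-truth mask in the same pass that renders the image. Everything here (class table, renderer, scene) is a hypothetical toy, not any real engine's API.

```python
CLASS_IDS = {"sky": 0, "road": 1, "car": 2, "pedestrian": 3}

def render_scene(objects, width, height):
    """Rasterize axis-aligned boxes; return the image AND its label mask."""
    image = [[(0, 0, 0)] * width for _ in range(height)]
    labels = [[CLASS_IDS["sky"]] * width for _ in range(height)]
    for obj in objects:  # each obj: class name, color, bounding box
        x0, y0, x1, y1 = obj["box"]
        for y in range(y0, y1):
            for x in range(x0, x1):
                image[y][x] = obj["color"]
                labels[y][x] = CLASS_IDS[obj["cls"]]  # the label comes for free
    return image, labels

scene = [
    {"cls": "road", "color": (90, 90, 90), "box": (0, 6, 16, 10)},
    {"cls": "car", "color": (200, 30, 30), "box": (4, 4, 9, 8)},
]
image, labels = render_scene(scene, width=16, height=10)
print(labels[5][5])  # a pixel inside the car box -> 2
```

No human annotator appears anywhere in the loop: the segmentation mask is a byproduct of rendering.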

DEtchells

Researchers use these simulators to generate scenarios that would be too dangerous to stage in real life (e.g., a car going the wrong way on the agent's street or people jumping out from behind a parked car). It's very important if we want to reach full self-driving.

AtriumComplex

Not a word about Tesla and their approach of just recording everything from every Tesla out there. Not only do they get huge datasets, but also all the weird edge cases.

lefotografion

Reminds me of the Westworld TV show, where AIs ran simulations to predict the stock market or even specific future events and find the best strategy.

nickplays

I wonder what the implications of this code will be for video games, tbh. If the code for self-driving cars has already been ported into the game world in order to train the AIs better, it also means you could just leave it there and use it for more realistic and capable AI allies and opponents. It also means that, in the future, upcoming military drone ground vehicles will be able to train against real people in virtual environments, as well as against other drones. I'm sure there are bigger implications I've missed as well.

Edit: also, yes, there are in fact already self-driving mini-tanks with 30mm cannons being developed. Dark Tech did a video on them recently, I believe.

TheOriginalFaxon

AI learning from GTA makes me .. nervous.

coraltown

How do you train an AI to handle a hypothetical situation like Halloween, when pedestrians may dress up as objects such as traffic lights or buildings?

johnnychang

This was a nice video. The future isn't necessarily in the hands of autonomous cars, but the tech sure as hell will get implemented.

megalonoobiacinc

Each vehicle has a cellphone as a driver's license; it sends out mass, GPS vector, brake and turn limits.

brianburke

Awesome video and this was a very cool topic to learn about

monkeyd