Tricking AI Image Recognition - Computerphile

AI object detection is getting better and better, but as Dr Alex Turner demonstrates, it's far from perfect, and it doesn't recognise things in the same way we do.


This video was filmed and edited by Sean Riley.


Comments

For Halloween, I'm going to get a sharpie and put dots all over myself, and if anyone asks what I am, I'll be like "I'm a dog!"

generichuman_

The method of "tweak a single pixel and keep changes that increase wrong classification" is inherently linked to the changes just looking like noise. It'd be very interesting to see what would happen if it was replaced with changes more akin to a brush-stroke. What would the 'paintings' look like?

mikeworth
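
For the curious, a rough NumPy sketch of the hill-climbing attack described above; the `model` interface (an H x W x 3 float image in, a class-probability vector out), the step count, and the single-pixel proposal are all assumptions for illustration:

```python
import numpy as np

def greedy_pixel_attack(image, model, target_class, steps=10_000, rng=None):
    """Tweak one random pixel at a time; keep the change only if it raises
    the probability of the (wrong) target class."""
    rng = np.random.default_rng() if rng is None else rng
    adv = image.copy()
    best = model(adv)[target_class]          # assumed: model returns a probability vector
    for _ in range(steps):
        y, x = rng.integers(adv.shape[0]), rng.integers(adv.shape[1])
        old = adv[y, x].copy()
        adv[y, x] = rng.random(3)            # propose a random new RGB value in [0, 1]
        p = model(adv)[target_class]
        if p > best:
            best = p                         # accept the change
        else:
            adv[y, x] = old                  # revert it
    return adv, best
```

Because each accepted proposal is an isolated pixel, the accumulated changes can only ever look like scattered noise; swapping the proposal for a brush-stroke-shaped patch is exactly the experiment the comment suggests, and the result would accumulate into smoother, more painterly structures.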

Are these generated images extremely brittle?
Does the 99% confidence drop to 0% when you change just one more pixel? Or are they quite robust?

aclkeba

I'd love to see subtler changes to the image, e.g. only allowing each pixel's colour to shift within a small range of similar colours, to see whether you can change the classification while keeping the image looking very close to the original.

Potsu___

Adversarial attacks - love this topic!

Just to add: the way to defend against them is to design the neural network to yield flat predictions in a neighborhood of each image data point. That means that for all images close to an image in the data, the predictions don't change, which directly addresses how the adversarial examples are generated here. In general this isn't all that easy, because the flatness is a restriction on the model, and that can impact model performance.

Mutual_Information
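
A hedged sketch of what "flat predictions in a neighborhood" can look like as a training penalty; the PyTorch `model`, `optimizer`, batch tensors, and the budget `eps` are placeholders, and real defences usually probe the neighbourhood with a crafted attack rather than uniform noise:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, labels, eps=2 / 255):
    logits_clean = model(images)
    loss = F.cross_entropy(logits_clean, labels)          # ordinary supervised loss

    # Flatness term: predictions at a random point inside the eps-ball
    # around each image should match the clean predictions.
    noisy = (images + eps * (2 * torch.rand_like(images) - 1)).clamp(0, 1)
    flatness = F.kl_div(F.log_softmax(model(noisy), dim=1),
                        F.softmax(logits_clean.detach(), dim=1),
                        reduction="batchmean")

    optimizer.zero_grad()
    (loss + flatness).backward()
    optimizer.step()
```

The penalty is exactly the restriction the comment mentions: the more weight you give it, the flatter (and usually slightly less accurate on clean data) the model becomes.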

Could make for an interesting sci-fi murder mystery. In a future of self-driving cars, a hacker kills people by adding noise to the cameras' images, tricking them into thinking they're looking at, say, an open road when it's really a cement barrier or something. It would be a high-tech version of Wile E. Coyote drawing a tunnel on a rock!

VonKraut

Would be interesting to see how these models do with face recognition under similar circumstances. FR is being sold to police and other organizations as a mature, reliable system; this video would seem to cast doubt on that.

wktodd

This can get even scarier.
If you take the gradients a model produces for a certain image during training, and then add or subtract weighted gradients from the image, the image does not change for us humans, but for the AI it often becomes something very different.

knicklichtjedi
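
That is essentially the fast gradient sign method (FGSM). A minimal PyTorch sketch, assuming a differentiable classifier `model` that takes a batch, a single image tensor in [0, 1], and its integer label tensor; `eps` controls how invisible the change is:

```python
import torch
import torch.nn.functional as F

def fgsm(model, image, label, eps=4 / 255):
    """Nudge every pixel a tiny step in the direction that increases the loss."""
    image = image.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(image.unsqueeze(0)), label.unsqueeze(0))
    loss.backward()
    # The perturbation is imperceptible to us, but it is aimed straight at
    # the model's decision boundary, so the prediction often flips.
    return (image + eps * image.grad.sign()).clamp(0, 1).detach()
```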

He didn't talk about a very important point: you can design an adversarial example that works on one model trained on ImageNet, apply it to a different model trained on ImageNet (which arguably should have vastly different weights), and get similar outputs.

thelatestartosrs

What you want is to be able to run a randomising algorithm on the input, adding artificial noise, then a smoothing algorithm on top of that, and still get a correct identification of the original object in the processed image. That way, any deliberately added noise in the original will have its effects muted to insignificance.

blumoogle
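
A sketch of that preprocessing idea, roughly in the spirit of randomised-smoothing defences; the noise level and kernel size here are arbitrary, and heavy smoothing also costs some accuracy on clean images:

```python
import torch
import torchvision.transforms.functional as TF

def smooth_input(image, noise_std=0.05, blur_kernel=5):
    """Drown any carefully placed adversarial pixels in fresh random noise,
    then blur, before handing the image to the classifier."""
    noisy = image + noise_std * torch.randn_like(image)
    return TF.gaussian_blur(noisy, kernel_size=blur_kernel).clamp(0, 1)
```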

"working backwards to figure out how a neural network thinks" reminds me of how recently, the Dall-E team showed that outside of the english language, there were some words that the neural network itself "made up" to classify things. Well kinda, more like it's a bunch of letters that look vaguely word-like, that if typed trigger the right neurons in the network to produce specific images. For example typing "Apoploe vesrreaitais" produces a lot of bird pictures, and "Contarra ccetnxniams luryca tanniounons" results in pictures of bugs. Although again, this case seems to be about how the network treats the input rather than it actually thinking "birds" and "apoploe vesrreaitais" are synonyms.

raedev

Apparently we need another step in the optimization of NNs, or rather another metric that conveys "stability of results". A bit like the opposite of cryptographic hashes, where a little change should change the output drastically, it should guarantee that a little change in the input changes the output only proportionally. Then we could assign a network a label like "category S5", meaning "it still gives the same result with at least 5% of the input (here: pixels) changed at random". How one would do that, or prove that a network has that property without having to brute-force it - I'll leave that task to the mathematicians.

NFSHeld
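
Proving such a property is hard (it is closely related to certified-robustness research), but the brute-force version of the check is easy to state. A toy sketch, with the 5% figure, the trial count, and the `model` interface (H x W x 3 array in, score vector out) all assumed:

```python
import numpy as np

def looks_stable(model, image, frac=0.05, trials=100, rng=None):
    """Sampling-based evidence (not a proof) that the predicted class
    survives random changes to `frac` of the pixels."""
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = image.shape
    baseline = np.argmax(model(image))
    n = int(frac * h * w)
    for _ in range(trials):
        perturbed = image.copy()
        ys, xs = rng.integers(0, h, size=n), rng.integers(0, w, size=n)
        perturbed[ys, xs] = rng.random((n, 3))   # overwrite n random pixels
        if np.argmax(model(perturbed)) != baseline:
            return False
    return True
```

Passing this check says nothing about worst-case, adversarially chosen pixels, which is exactly the gap the mathematicians would need to close.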

Wonderful job explaining this subject! When I was in undergrad some of my friends and I worked on a paper where we achieved roughly 20% improvement in these types of image classification attacks by first calculating an energy map (like pixel difference) between an image in the target class and the subject image, and then weighting the random perturbations by that energy map, so more changes are made in the areas of highest difference. Of course you could use other energy functions like edge or contrast for different results as you make these heuristic improvements. Really fascinating area of study.

andrewcarluccio
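
A rough sketch of the weighting step as described in the comment (not the paper's actual code); `subject` and `target_example` are assumed to be H x W x 3 arrays, and the energy function here is a plain per-pixel colour difference:

```python
import numpy as np

def energy_weighted_coords(subject, target_example, n_samples, rng=None):
    """Pick pixel coordinates to perturb, biased towards the regions where
    the subject image differs most from an image of the target class."""
    rng = np.random.default_rng() if rng is None else rng
    energy = np.abs(subject - target_example).sum(axis=-1)   # per-pixel colour difference
    probs = energy.ravel() / energy.sum()
    flat = rng.choice(energy.size, size=n_samples, p=probs)
    return np.unravel_index(flat, energy.shape)              # (row indices, column indices)
```

Swapping the difference map for an edge or contrast map changes where the attack concentrates its effort, as the comment notes.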

AI: What kind of dog is that?
Programmer: That's a giraffe.

acidsniper

You could use the misclassified golf-ball images to retrain the network by feeding them back in and telling it categorically, "This is not a golf ball." I wonder whether, given enough misclassified images, the network would become robust to these pixel attacks the same way humans are.

tobuslieven
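
That is the intuition behind adversarial training: fold the fooling images, with their correct labels, back into the training set. A hypothetical PyTorch sketch, where `train_ds`, `adv_images`, and `true_labels` are all assumed to already exist:

```python
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# adv_images: tensor of crafted images the model called "golf ball";
# true_labels: what those images actually show.
augmented = ConcatDataset([train_ds, TensorDataset(adv_images, true_labels)])
loader = DataLoader(augmented, batch_size=64, shuffle=True)
# ...then continue training the network on `loader` as usual.
```

In practice this helps against the specific attack you retrained on, but fresh attacks against the retrained network can usually still be found, so robustness tends to be an arms race rather than a one-shot fix.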

A problem I see is the tremendous difference in hue: a neon green pixel on a black background.
Limit the changes to one factor per pixel per step: either shift its hue (by one RGB value at a time), or give the algorithm a way to dismiss a change as "too improbable".

EnjoyCocaColaLight

The first problem is scale invariance. You could make the image larger or smaller (i.e. more or fewer pixels) and it still wouldn't fool people, for many reasons. Our "training set" is more like videos than still photos. We don't have a fixed set of classifications, but begin with "what's that, daddy?". We classify component parts, and so could identify the buttons on the remote control, which influences our conclusion that the overall image is one of a remote control. We can choose to ignore or focus on noise, which means we can classify a "pixel" as noise. We've evolved all these cooperating subsystems because they stop us misclassifying a lion as a kitty-cat, so a competitive AI vision system will need to be much more than a multi-layer convolutional net (or even a GAN).

BethKjos

Surely this isn't that unexpected. The neural net is trained on images from reality, so the appearance of the training data is constrained in this way; it never sees unphysical images. The method of tweaking existing images can lead to unphysical results. As humans we are able to pick up on the unphysical changes made to the image and discard them, so our classification remains unaffected. The machine has never learnt that distinction, so it incorporates the unphysical data into its interpretation and gets confused.

If you perturbed the training data in this way and trained the net on the perturbed data too, I reckon that would do the trick, although maybe there would be too many such perturbations to cover.

rammerstheman

So overjoyed to find out I'm not the only person on earth anymore who emails themselves things.

jontrout

What if you trained it with a collection of images which also had random speckles of noise on top? Would it dedicate a layer to denoising? :)

trejkaz
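
That kind of augmentation is cheap to try; a sketch of a training-time transform that adds salt-and-pepper speckles to a C x H x W image tensor (the speckle probability is arbitrary):

```python
import torch

def add_speckles(image, prob=0.01):
    """Overwrite a small random fraction of pixels with pure black or white,
    so the network sees a noisy variant of every training image."""
    mask = torch.rand(image.shape[-2:]) < prob             # which pixels to speckle
    values = (torch.rand(image.shape[-2:]) > 0.5).float()  # black (0) or white (1)
    out = image.clone()
    out[:, mask] = values[mask]                            # same value across all channels
    return out
```

Whether the network then "dedicates a layer to denoising" is harder to answer without probing the early layers, but training on noisy copies does tend to make it shrug off that particular kind of noise.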