CS231n Winter 2016: Lecture 9: Visualization, Deep Dream, Neural Style, Adversarial Examples

Stanford Winter Quarter 2016 class: CS231n: Convolutional Neural Networks for Visual Recognition. Lecture 9.

Get in touch on Twitter @cs231n, or on Reddit /r/cs231n.
Comments

*My takeaways:*
1. Visualize patches that maximally activate neurons 2:30
2. Visualize the weights 3:26 (for the input layer)
3. Visualize the representation space (e.g. with t-SNE) 5:24
4. Occlusion experiments 8:32
5. Visualize activations 10:21
6. Deconv approaches (single backward pass) 15:33
7. Optimization over image approaches 29:15 (see the gradient-ascent sketch below)
8. Deep dream 42:40
9. Neural style 51:55
10. Adversarial examples 1:02:05
11. Summary 1:17:47
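
For item 7, here is a minimal sketch of the class-visualization idea from the lecture (gradient ascent on the image with L2 regularization, in the spirit of Simonyan et al.). It assumes a hypothetical PyTorch classifier `model` that maps a 224×224 RGB image to unnormalized class scores; all names and hyperparameters are illustrative, not the exact setup from the slides.

```python
import torch

def visualize_class(model, class_idx, steps=200, lr=0.1, l2_reg=1e-3):
    # Start from a blank (all-zero) image and optimize its pixels directly.
    img = torch.zeros(1, 3, 224, 224, requires_grad=True)
    optimizer = torch.optim.SGD([img], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(img)[0, class_idx]           # unnormalized score for the target class
        loss = -score + l2_reg * (img ** 2).sum()  # ascend the score, keep pixel values small
        loss.backward()                            # gradients flow back into the image itself
        optimizer.step()
    return img.detach()
```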

leixun

About reconstruction:
Reconstruction vs. running forward through the network is like integrating vs. differentiating, in the sense that every node that was activated carries more information than any node that was not.
d/dx (x^2 + x + 15) = 2x + 1, and integrating back gives x^2 + x + n,
where n is the missing information that did not survive the thresholding.

neriyacohen

Andrej, thank you very much for these lectures! My question is: have you worked with 3D convolutional nets applied to RGB-D images? Do you think it is possible to extract volumetric features from, for example, terrain, and transfer them to a deep Q-network to teach a robot to walk?

rudmax

This is a great video, but I can't follow it without subtitles. Would you please add subtitles to the video? Thanks!

yizhao

Would image reconstruction be as distorted without pool layers? What model was used in the example reconstruction slides?

Best lecture yet, really strengthened intuitions. Keep it up!

Fatt_Dre

Wonderful!! Thank you for sharing this stuff online!!!

ynwicks

Does t-SNE need an entire batch of images (or more generally, data) to create the low-dimensional feature space? With PCA you can create a low-dimensional feature space on a batch of data and then project new data points onto that same space without having to "retrain". Is that true for t-SNE?

I ask because I noticed that scikit-learn has t-SNE in its manifold module, but that class does not have a transform() method the way PCA does. So, at least in sklearn, it seems this is not possible.

My question boils down to this: how would you apply t-SNE in a streaming or online situation where you want to continually update the visualization with new images? Presumably, one would not want to re-run the algorithm on the entire batch for each new image.
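
That is indeed how scikit-learn behaves. A minimal sketch of the difference, using random vectors as a stand-in for the 4096-d CNN codes from the lecture (array names are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

codes = np.random.randn(500, 4096)        # stand-in for fc7 CNN codes

# PCA learns a fixed linear projection, so new points can be embedded later.
pca = PCA(n_components=2).fit(codes)
new_point = np.random.randn(1, 4096)
print(pca.transform(new_point))           # out-of-sample projection works

# TSNE only exposes fit_transform: the embedding is tied to the batch it was
# fit on, so new points cannot be projected without re-running the algorithm
# (or using a parametric / out-of-sample extension outside of sklearn).
embedding = TSNE(n_components=2).fit_transform(codes)
print(embedding.shape)                    # (500, 2)
```

For the streaming case, common workarounds are re-running t-SNE periodically or using a parametric variant that learns an explicit mapping.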

EvanZamir

At 37:34, "pieces of ocean" are shown in the center of the image, surrounded by gray. It looks like at this layer everything ends up in the center of the image. Why is it centered? Maybe because the conv network used zero padding and sliding windows?

Also, I wonder why the gray color? Is it due to color averaging in places the neuron doesn't care about (where anything could be, which would mark the border)?

mynameisZhenyaArt_

54:12 Andrej starts explaining style targets and the Gram matrix.

eluz

Question! (from an absolute beginner):

For the slide shown @24:52 (the bottom example on "Backward pass: guided backpropagation"): why don't all the negative numbers map to 0? My understanding was that this should happen automatically in order to reconstruct the image properly.

Thanks.
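
On the negative-numbers question: the backward rules compared around that slide differ in which negatives get zeroed. A minimal numpy sketch of the ReLU backward pass under each rule; `x` and `dy` are illustrative placeholders for the forward input and the incoming gradient, not the exact numbers on the slide:

```python
import numpy as np

def relu_backward_rules(x, dy):
    """Backward pass through a ReLU under the three rules compared in the lecture.
    x  : input to the ReLU from the forward pass
    dy : gradient arriving from the layer above
    """
    backprop  = dy * (x > 0)             # plain backprop: zero where the forward activation was negative
    deconvnet = dy * (dy > 0)            # deconvnet: zero where the incoming gradient is negative
    guided    = dy * (x > 0) * (dy > 0)  # guided backprop: both conditions, so all negatives vanish
    return backprop, deconvnet, guided
```

Under plain backprop, negative gradient values can survive as long as the forward activation was positive; guided backpropagation additionally zeroes the negative gradients.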

heyloo

37:00 is wrong: each of the four images uses a different set of regularization hyperparameters. As for different initializations, look at Figure 4 of the paper; the 9 images there are different initializations.

AvielLivay

I still cannot understand how to crop the input images to get the corresponding image crops after we get the deconv feature map.
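
One way to get those crops is to compute the receptive field of the unit whose activation was deconv'd: each unit corresponds to a patch of the input whose size and stride follow from the kernel sizes and strides below it. A minimal sketch; the layer list is a hypothetical stack, not the exact model from the slides:

```python
def receptive_field(layers):
    """Given (kernel, stride) pairs for each conv/pool layer from the input upward,
    return the input-patch size of one top-layer unit and its stride ("jump") in
    input pixels. Cropping that patch around the unit's location gives the image
    crop corresponding to the activation."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf, jump

# Example: three 3x3 stride-1 convs followed by a 2x2 stride-2 pool.
print(receptive_field([(3, 1), (3, 1), (3, 1), (2, 2)]))  # -> (8, 2)
```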

xianxuhou

Isn't the problem mentioned at the end just overfitting? A deep network has many more parameters than there are examples in the training set.

realGBx

Is there an equivalent way to construct adversarial examples for human vision? If not, why not? Would that not indicate that, at a fundamental level, there is something different in the way human and computer vision work?

ChengZhao

So the reason adversarial examples get misclassified is the network's linear nature. What happens if we use some other activation function, like tanh?
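
For reference, the construction discussed in the lecture (Goodfellow et al.'s fast gradient sign method) relies on exactly that locally linear behavior: a tiny per-pixel step in the direction of the sign of the loss gradient adds up across many input dimensions. A minimal PyTorch-style sketch, assuming a hypothetical classifier `model` and loss `loss_fn`:

```python
import torch

def fgsm(model, loss_fn, x, y, eps=0.007):
    # Fast gradient sign method: perturb each pixel by +/- eps in the
    # direction that increases the loss for the true label y.
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```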

NM-jqsv

Adversarial examples are present in human vision too. Right?

RaviprasadKini

Shouldn't it be an inner product and not an outer product? (at about 55:20)
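
Both descriptions are used, and they coincide: the Gram matrix is the sum over spatial positions of outer products of the C-dimensional feature vectors, while each individual entry is an inner product of two channel responses taken across positions. A small numpy sketch with illustrative shapes:

```python
import numpy as np

C, H, W = 64, 32, 32
fmap = np.random.randn(C, H, W)      # activations of one conv layer

F = fmap.reshape(C, H * W)           # one C-dim feature vector per spatial position
G = F @ F.T                          # Gram matrix, shape (C, C)

# Same matrix as the sum over positions of each feature vector's outer product
# with itself; entry G[i, j] is the inner product of channels i and j over space.
G_check = sum(np.outer(F[:, p], F[:, p]) for p in range(H * W))
assert np.allclose(G, G_check)
```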

mehulajax

Why not use L1 normalization for sparsity? 36:00

arjunkrishna

@21:00 I guess that in the backward propagation, the matrix containing zeros has them in the wrong positions, since the negative values aren't zeroed out.

questforprogramming

I wanted to code all that up like you did but I am not able to do it 🥲

tilakrajchoubey