CS231n Winter 2016: Lecture 11: ConvNets in practice

Stanford Winter Quarter 2016 class: CS231n: Convolutional Neural Networks for Visual Recognition. Lecture 11.

Get in touch on Twitter @cs231n, or on Reddit /r/cs231n.
Comments
Author

I keep watching this series over & over, hoping some intelligence will back-propagate into my brain.

pauldacus
Author

The lecturer should REPEAT QUESTIONS from the audience.

wiiiiktor
Author

*My takeaways:*
1. Making the most of your data
- Data augmentation 4:28
- Transfer learning 13:14
2. All about convolutions
- How to arrange them 25:00
- How to compute them fast 36:22
3. 'Implementation details'
- CPU/GPU 49:44
- Distributed training 56:27
- Bottleneck 1:02:19

leixun
Author

Is there a mistake on page 81?
FFT gives bigger speedups for large kernels.

HuangBrian-gn
Author

Here is what I found when using ideas from "the power of small filters" slides: the network is harder to train (training takes longer or doesn't converge at all), and inference is slower on CPU. The final model is smaller (it uses less memory), but it seems to take more computation, so in practice the slides are wrong. The same happens with the SqueezeNet architecture, which I've been using.

barbolo
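
For reference, the counting argument usually made for small filters can be sketched as below. This assumes the comparison is between one 7x7 layer and three stacked 3x3 layers, each with C input and C output channels, stride 1, and biases ignored; the helper names and the values of C, H, W are illustrative only. The sketch counts weights and multiply-adds on paper and says nothing about training difficulty or wall-clock time on a CPU, which is what the comment above reports.

```python
def conv_params(kernel_size, c_in, c_out):
    """Number of weights in one conv layer (biases ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def conv_madds(kernel_size, c_in, c_out, height, width):
    """Multiply-adds for one conv layer at stride 1 with 'same' padding."""
    return conv_params(kernel_size, c_in, c_out) * height * width


# Illustrative sizes (assumptions, not taken from the lecture).
C, H, W = 64, 32, 32

one_7x7_params = conv_params(7, C, C)        # 49 * C^2
three_3x3_params = 3 * conv_params(3, C, C)  # 27 * C^2

one_7x7_madds = conv_madds(7, C, C, H, W)        # 49 * H * W * C^2
three_3x3_madds = 3 * conv_madds(3, C, C, H, W)  # 27 * H * W * C^2

print(one_7x7_params, three_3x3_params)
print(one_7x7_madds, three_3x3_madds)
```

The stacked version needs fewer weights and fewer multiply-adds, but it runs three layers in sequence and stores two extra intermediate activation maps, so measured speed can differ from what the counts suggest.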
Author

There seems to be a mistake in the "Computing Convolutions: Recap" slide: it says "FFT: Big speedups for small kernels". Shouldn't that be "big kernels"?

GohOnLeeds
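
The point about kernel size can be illustrated with a minimal NumPy sketch of FFT-based convolution checked against a direct loop. The function names are made up for this example, and it handles a single-channel image only. The direct version does work proportional to the kernel area for every output pixel, while the FFT version's cost depends on the padded image size rather than the kernel size, which is why the advantage is expected to grow with larger kernels.

```python
import numpy as np


def conv2d_direct(image, kernel):
    """Full 2D convolution computed by accumulating shifted copies of the image."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap adds a shifted, scaled copy of the image.
            out[i:i + H, j:j + W] += kernel[i, j] * image
    return out


def conv2d_fft(image, kernel):
    """Same full convolution via the convolution theorem."""
    H, W = image.shape
    kh, kw = kernel.shape
    shape = (H + kh - 1, W + kw - 1)  # zero-pad so circular conv equals linear conv
    f_image = np.fft.rfft2(image, s=shape)
    f_kernel = np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(f_image * f_kernel, s=shape)


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
kernel = rng.standard_normal((5, 5))
print(np.allclose(conv2d_direct(image, kernel), conv2d_fft(image, kernel)))
```

Note that this computes a true convolution (kernel flipped); ConvNet layers typically compute cross-correlation, which differs only by flipping the kernel.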
Author

Are there any other lectures taught by Andrej? I'm not enjoying the second half of this course.

essamal-mansouri
Author

Why is it necessary to add a 3x3 conv layer at 33:11? The same output can be obtained by using a single 1x1 conv with C channels.

nihalthukaram
Author

aren't some of the newest models running on int8 precision for edge computing? lmao

Supreme_Lobster
Author

Great lecture.
I want to ask how I can prepare my data so it can be used in conv3D. Some videos do not have the same number of frames. Please, if you have a complete example of how to implement a 3D CNN and RNN in Keras that works with a video stream, send it to me.
I read your paper on large-scale video classification with CNNs. Is your code available? Thanks

hanyel-ghaish
Author

How do you make all the images in your training set the same shape? If I just resize them, I'm afraid I could lose important features. Would the 10-crop technique be a good way to solve this problem?

kmagzhan
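
For readers unfamiliar with the 10-crop idea mentioned above, here is a minimal NumPy sketch: four corner crops plus a center crop, each with its horizontal flip. The function name is made up for this example, and it assumes the image has already been resized so that both sides are at least `crop_size`. This scheme is commonly used at test time, with predictions averaged over the ten crops; whether it suits a particular training set is a separate question.

```python
import numpy as np


def ten_crop(image, crop_size):
    """Return 10 crops of an H x W x C image: 4 corners + center, plus flips.

    Output shape: (10, crop_size, crop_size, C).
    """
    H, W = image.shape[:2]
    c = crop_size
    if c > H or c > W:
        raise ValueError("crop_size must not exceed the image height or width")
    # (top, left) of each crop: four corners, then the center.
    origins = [
        (0, 0),
        (0, W - c),
        (H - c, 0),
        (H - c, W - c),
        ((H - c) // 2, (W - c) // 2),
    ]
    crops = [image[t:t + c, l:l + c] for t, l in origins]
    # Append the horizontal flip of each of the five crops.
    crops += [crop[:, ::-1] for crop in crops]
    return np.stack(crops)


example = np.arange(8 * 10 * 3).reshape(8, 10, 3)
print(ten_crop(example, 4).shape)
```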
Author

Can anyone answer my question? I'd appreciate it. At 26:20, the answer is 5x5 for a stride of 1. However, if the stride is 3, the answer will change, right?

JS-ggpx
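
The receptive-field arithmetic behind this question can be sketched in a few lines. This uses the standard recurrence for stacked convolutions, with layers given as hypothetical `(kernel_size, stride)` pairs: each layer widens the receptive field by `(kernel_size - 1)` times the spacing, in input pixels, between adjacent neurons of the layer below.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, per side) of one output neuron.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1    # with no layers, a neuron sees exactly one input pixel
    jump = 1  # spacing in input pixels between adjacent neurons at this depth
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1), (3, 1)]))          # two 3x3 convs, stride 1 -> 5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three 3x3 convs, stride 1 -> 7
print(receptive_field([(3, 3), (3, 1)]))          # first conv has stride 3 -> 9
```

With stride 1, each neuron in the second 3x3 layer looks at a 3x3 patch of first-layer neurons, whose own 3x3 windows overlap and together cover 5x5 input pixels. With a stride of 3 in the first layer those windows no longer overlap, so the same two layers cover 9x9.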
Author

Great lectures. One thing I've been wondering about, which I haven't seen discussed yet, is how the number of class labels affects the learning/training of CNNs. For example, I want to use a model trained on ImageNet (1000 classes) to classify an image set that only has 2 class labels. My understanding is that having fewer labels makes it more difficult to train models. Is that true? Will you discuss that in these lectures?

EvanZamir
Author

I understood the data augmentation part

agnesakne
Author

Hi, I wonder how we define 'small data set' and 'medium data set'
How about a data set with 50000 pictures?

jingkangY
Author

When Justin refers to Linear Classifier, does that include SVM as well?

robromijnders
Author

At 26:02, can somebody explain why 5x5 is the answer?

iloveno