Lesson 3: Practical Deep Learning for Coders

UNDERFITTING & OVERFITTING

We’ve now seen a number of ways to build CNNs; it’s time to build a more complete understanding of how they work. In this lesson we review in more detail what a convolution does, and how convolutions are combined with max pooling to create a CNN. We also learn about the softmax activation function, which is critical for getting good results in classification models (a classification model is any model designed to separate data items into classes, that is, into discrete groups). NB: If you’re having trouble understanding the convolution operation, you may want to skip ahead and watch the start of lesson 4, since it opens with a spreadsheet-based explanation of convolutions.
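
For concreteness, here is a minimal Keras sketch (not the lesson notebook's code) of how convolution, max pooling, and a softmax output fit together; the input size, layer widths, and class count are arbitrary choices for illustration.

from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model

# A convolution slides small learned filters over the image to detect local
# patterns; max pooling keeps the strongest response in each region, halving
# the spatial resolution; softmax turns the final activations into class
# probabilities that sum to 1.
inp = Input(shape=(224, 224, 3))                 # RGB image (size is illustrative)
x = Conv2D(32, (3, 3), activation='relu')(inp)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
out = Dense(10, activation='softmax')(x)         # 10 classes, purely as an example
model = Model(inputs=inp, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])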

Then we delve into the most important skill in creating an effective model: dealing with overfitting and underfitting. The key is first to build a model that overfits (so we know we have enough model capacity and that we can train it), and then gradually apply a number of strategies to reduce the overfitting. The most important section of this lesson is where we look at the list of techniques used to address overfitting. We suggest copying this list somewhere convenient and referring to it often; make sure you understand what each step means and how to carry it out.

We then look at two particularly useful techniques for avoiding overfitting: dropout and data augmentation. We also discuss the extremely handy technique of pre-computing convolutional layers. Make sure you understand this technique before you continue, and practice it yourself, since we’ll be using it in every lesson and every notebook from here on!
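
To see the shape of the pre-computed-features idea in code, here is a hedged Keras sketch: it uses keras.applications.VGG16 rather than the course's vgg16.py utilities, and train_images, train_labels, val_images, and val_labels stand in for whatever arrays you have loaded. The expensive, frozen convolutional layers run over the data once, the result is cached, and only the small dense model is trained on those features.

import numpy as np
from keras.applications import VGG16
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model

# 1. Run the frozen convolutional layers once over the data and cache the output.
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
train_features = conv_base.predict(train_images)   # train_images / val_images: your image arrays
val_features = conv_base.predict(val_images)
np.save('train_features.npy', train_features)      # reuse across experiments

# 2. Train only the cheap dense layers on the cached features
#    (labels are assumed to be one-hot encoded).
feat_in = Input(shape=train_features.shape[1:])
x = Flatten()(feat_in)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)
top_model = Model(inputs=feat_in, outputs=preds)
top_model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
top_model.fit(train_features, train_labels,
              validation_data=(val_features, val_labels), epochs=10)
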
Comments

Finetuning starts at 59:16
Under/overfitting starts at 1:12:45
Data augmentation starts at 1:31:18
Batch normalization starts at 1:40:00

channelpanel

Hi Jeremy. I have an ANN model which predicts a set of parameters from some time-series data. Training accuracy keeps increasing (up to 96-97%), but validation accuracy stays flat throughout (~80%), even though I have tried modifying the training sets, used dropout, experimented with learning rates, and used k-fold validation. What are the ways to deal with such flat validation curves?

madhurocks

When finetuning, is there a benefit to adding another set of layers at the end of the model? (I am thinking of scenarios where we are attempting to classify beyond image object classes and instead look at higher-level concepts like style or sentiment.)
If so, would we randomise the weights for these new layers?
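
(For illustration only, not an answer from the lesson: in Keras, adding a new set of layers on top of a frozen pretrained base usually looks like the sketch below. Freshly created layers are randomly initialised by default, so no explicit randomisation step is needed; VGG16, the layer sizes, and the class count are assumptions.)

from keras.applications import VGG16
from keras.layers import Flatten, Dense
from keras.models import Model

base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False          # keep the pretrained conv weights fixed at first

# New layers for the new task; Keras initialises fresh layers with random
# weights by default, so no extra randomisation step is required.
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
preds = Dense(5, activation='softmax')(x)   # 5 = hypothetical number of new classes
model = Model(inputs=base.input, outputs=preds)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])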

mbkhan

What was the paper referenced at 55:40 about why filters of size 3 are best?

melonkernel

Is it worth having different learning rates for each layer?

ricebowl

Amazing video! I have no idea why this video doesn't have more views???



EAT THAT CNN! Feed Forward is !!!

from keras.layers import Input, Dense, BatchNormalization, Dropout
from keras.models import Model
from keras.optimizers import Adam

inputs = Input(shape=model_details['input_shape'], dtype='float32')  # model_details defined elsewhere in the commenter's code
network = Dense(300, activation='relu')(inputs)
network = BatchNormalization()(network)
network = Dropout(0.5)(network)
network = Dense(300, activation='relu')(network)
network = BatchNormalization()(network)
network = Dropout(0.5)(network)
output = Dense(10, activation='softmax')(network)
model = Model(inputs=inputs, outputs=output)
# With a 10-class softmax output the loss should be categorical_crossentropy;
# binary_crossentropy makes Keras report a misleadingly inflated 'accuracy'.
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(0.0008, 0.05),  # second positional argument is beta_1 in Keras' Adam
              metrics=['accuracy'])

Batch_size = 128

Epoch:27
step - loss: 0.0099 - acc: 0.9969 - val_loss: 0.0097 - val_acc: 0.9972

happydays