When Should You Use L1/L2 Regularization

Overfitting is one of the main problems we face when building neural networks. Before jumping into fixes for over- or underfitting, it is important to understand what it means, why it happens, and what problems it causes for our neural networks. In this video, we will look into L1 and L2 regularization: how these techniques work and when to use them.
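
As a taste of how the two penalties are usually switched on in practice, here is a minimal sketch assuming a Keras-style model (the layer sizes and penalty strengths are illustrative assumptions, not values from the video):

# Minimal sketch: enabling L1 and L2 weight penalties in Keras.
# All sizes and strengths below are illustrative, not from the video.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    # L2 ("weight decay") penalty on this layer's weights:
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01)),
    # L1 penalty, which tends to drive some weights exactly to zero:
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(0.01)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")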

❓To get the most out of the course, don't forget to answer the end-of-module questions:

👉 You can find the answers here:

RESOURCES:

COURSES:

Comments

Wow, thank you for this video!!! This 8-minute video was better than my instructor's 8-hour class on the same topic.

hanipmanju

The model that lets you use both L1 and L2 regularization techniques is called Elastic Net. It has an extra parameter that takes values in the range of 0 to 1. I just read about it yesterday in a book. Anyway, thanks for this great series; I am a complete beginner to NNs, and it is helping me a lot in understanding the big picture and all the basic concepts and procedures of NNs.

oo_wais
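
For anyone who wants to try the combined penalty mentioned above, a minimal sketch assuming scikit-learn's ElasticNet (the 0-to-1 mixing parameter is called l1_ratio there; the data below is synthetic and purely illustrative):

# Minimal sketch: Elastic Net mixes the L1 and L2 penalties.
# l1_ratio=1.0 is pure L1, l1_ratio=0.0 is pure L2.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=100)

model = ElasticNet(alpha=0.1, l1_ratio=0.5)  # alpha sets the overall penalty strength
model.fit(X, y)
print(model.coef_)  # the L1 part tends to zero out the uninformative coefficients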

The goal of regularization is to spread the transfer from one layer to the next across as many connections as possible, thereby forcing the network to consider many aspects of the connection between the input and output. This is done by penalizing 'tunnelling' through a few connections, and that is exactly what penalizing large weights does.

NisseOhlsen
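
To make that penalty concrete, a minimal sketch of the regularized loss in plain NumPy (the numbers and variable names are illustrative assumptions, not values from the video):

# Minimal sketch: the L1 and L2 terms that get added to the data loss.
# lam is the regularization strength (often written as lambda or alpha).
import numpy as np

w = np.array([3.0, -0.5, 0.1])   # weights of some layer
data_loss = 0.42                  # whatever the unregularized loss happens to be
lam = 0.01

l1_loss = data_loss + lam * np.sum(np.abs(w))   # L1: penalty grows linearly with |w|
l2_loss = data_loss + lam * np.sum(w ** 2)      # L2: large weights are punished quadratically

print(l1_loss, l2_loss)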

OK, I subscribed! Like a simple NN, I see talent and I converge to my optimum solution.

andresg

Hello Mısra Hanım. I want to build a model using the Ridge regression method with a dataset that has a small number of samples. However, while building the model I will work out the solution by hand. Could you help me with this?

kylmaz

At 5:40 you referred to the parameter as "alpha"; shouldn't it correctly be λ?

bay-bicerdover

L1 was solid; I wish L2 were explained as well as L1.

wut

Why can't I use alpha > 1? Also, doesn't this fail for networks with batch norm, for example?

grekiki