Kernel Size and Why Everyone Loves 3x3 - Neural Network Convolution


Find out what the Kernel Size option controls and which values you should use in your neural network.
Comments

The basic reason we don't use (even number) x (even number) kernels is that those kernels don't have a "center". Having a "center" pixel (as in a 3x3 configuration) is very useful for max and average pooling - it's much more convenient for us.
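To make the "no center" point concrete, a minimal sketch (hypothetical, not from the video) of where a kernel's taps sit relative to its geometric center:

```python
# Tap offsets of a 1-D kernel relative to its geometric center.
# Odd sizes put one tap exactly on the center pixel; even sizes don't.
def tap_offsets(k):
    center = (k - 1) / 2          # geometric center of a k-tap kernel
    return [i - center for i in range(k)]

print(tap_offsets(3))  # [-1.0, 0.0, 1.0] -> a true center tap exists
print(tap_offsets(2))  # [-0.5, 0.5]      -> no tap lands on the center
```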

IoannisKazlaris

This is honestly the best video related to machine learning I have seen, amazing work. Most people just pull architectures out of thin air or make a clumsy disclaimer telling you to experiment with the numbers. This video shows 3D visual representations of popular CNN architectures and really helps you understand how to build CNNs in general.

axelanderson

I have a few decades of hobbyist signal processing experience, and these new methods seem so amateurish compared to what we had in the past: FFT, FHT, DCT, MDCT, FIR filters, IIR filters, FIR design based on frequency response, edge-adapted filters (so no need for smaller outputs), filter banks, biorthogonal filter banks, window functions, wavelets, wavelet transforms, Laplacian pyramids, curvelets, contourlets, non-separable wavelets, multiresolution analysis, compressive sensing, sparse reconstruction, SIFT, SURF, BRISK, FREAK, yadda yadda. Yes, we even had even-length filters, and different filters for analysis than for synthesis.

FrigoCoder

Such an amazing video. You're going to hit 50k soon! Keep this up!!

matthewboughton

Thanks for the effort of making this excellent visualization! It creates a very good intuition for how convolutions work and why 3x3 is dominant.

schorsch

There was no reason I should have had this very question, and yet there was a great video on the internet telling me the exact reason why. Bless!

josephpark

Freaking love your videos! Keep up the awesome work! :D

alansart

Good explanation. Looking forward to more.

ankitvyas

Great work, your videos do help me a lot 👍

travislee

I think odd-sized filters are mainly used because we often use a stride of 1. Each pixel (except at the edges) will then be filtered based on its surrounding pixels (defined by the kernel size). If the kernel size is even, the pixel that the kernel represents would be the average of the 4 middle pixels, which introduces a sort of shift of 0.5 pixels. It might be fine mathematically speaking, but it feels odd or wrong. Also, if you have worked with Gaussian filters (which I assume many CNN researchers have), you are literally forced to use odd-sized filters there.
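The half-pixel shift described here can be demonstrated with a tiny NumPy experiment (a hypothetical sketch, not from the video): averaging a ramp signal with a 2-tap kernel produces values that sit between the original samples, while a 3-tap kernel keeps them on the grid.

```python
import numpy as np

# A 2-tap average on a ramp signal: every output value sits halfway
# between two input samples, i.e. the signal shifts by 0.5 pixels.
x = np.arange(5, dtype=float)                     # [0, 1, 2, 3, 4]
y = np.convolve(x, [0.5, 0.5], mode="valid")
print(y)  # [0.5 1.5 2.5 3.5] -> half-integer positions

# A 3-tap average keeps samples on the integer grid (no shift).
z = np.convolve(x, [1/3, 1/3, 1/3], mode="valid")
print(z)  # [1. 2. 3.]
```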

newperspective

Perhaps the 2x2 kernel is a common trick for learnable stride-2 downsampling kernels and upsampling (deconvolution) kernels. It is more likely about computational efficiency than network performance, because such kernels are almost equivalent to downsampling/upsampling followed by a 3x3 kernel. In this regard, 2x2 kernels combined with stride-2 down/upsampling operations do not shrink the resulting feature map by 2 pixels the way an unpadded 3x3 kernel does, which is possibly beneficial for image generation tasks. In GANs, 2x2 or 4x4 kernels are commonly found in discriminators, which favor non-overlapping kernels to avoid grid artifacts.

maxlawwk

This is awesome and is inspiring me to learn Blender!

sensitive_machine

In U-Net and GAN architectures, when a feature map half the size of the input needs to be generated, a 4x4 kernel is used.
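The halving behavior can be checked with the standard convolution output-size formula (a hypothetical sketch, not from the video); with kernel 4, stride 2, and padding 1 the feature map halves exactly:

```python
# Standard output-size formula: out = (n + 2p - k) // s + 1.
# With k=4, s=2, p=1 the feature map halves exactly, which is the
# usual U-Net / GAN-discriminator downsampling configuration.
def conv_out(n, k, s, p):
    return (n + 2 * p - k) // s + 1

print(conv_out(64, k=4, s=2, p=1))  # 32 -> exactly half
print(conv_out(64, k=3, s=2, p=1))  # 32 as well, but with uneven pixel coverage
```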

pritomroy

Regarding the 2x2 filter, the paper "Convolution with Even-sized Kernels and Symmetric Padding" may be helpful.

haofanren

Wow, really beautiful animations, great job! However, I got kind of confused since I always saw convolution in 2D haha

naevan

Hi Animated AI, thanks for your great video. I have a question:
At 4:45, the colors of the filters (i.e. red, yellow, green, blue) represent the "Features". But a filter (e.g. the red one) is itself 3-dimensional (Height, Width, Feature), so it also includes a "Feature" axis. Thus, "Feature" appears twice. Could you please explain why we need "Feature" twice?

bengodw

Really helpful for understanding the concept. Correct me if I'm wrong: the first conv2d layer will always have 1 input feature for a black-and-white image and 3 features for an RGB image, and after that the number of features depends on the number of filters used in the convolution.
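That understanding can be sketched as a simple shape calculation (hypothetical sizes, not from the video): the input feature count is the number of color channels, and each conv layer's output feature count equals its number of filters.

```python
# Feature counts through conv layers: the input feature count is the
# number of color channels; afterwards it equals the number of filters.
# Shapes are (height, width, features); the sizes here are hypothetical.
def conv_output_shape(input_shape, num_filters):
    h, w, _ = input_shape        # input features are consumed by the filters
    return (h, w, num_filters)   # assuming "same" padding, stride 1

print(conv_output_shape((224, 224, 3), num_filters=64))  # (224, 224, 64)
print(conv_output_shape((28, 28, 1), num_filters=32))    # (28, 28, 32)
```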

danychristiandanychristian

An even kernel size does not allow symmetric coverage of the area around a pixel.

kznsq

Is there a good reason why even filter sizes aren't used at all, other than that the padding will be uneven when using "same"?
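The uneven padding mentioned here can be shown concretely (a minimal sketch assuming TensorFlow-style "same" padding, which adds k − 1 padded pixels in total per dimension):

```python
# "Same" output size requires k - 1 total padded pixels. Odd kernels
# split this evenly between the two sides; even kernels cannot.
def same_padding(k):
    total = k - 1
    left = total // 2
    return left, total - left

print(same_padding(3))  # (1, 1) -> symmetric
print(same_padding(4))  # (1, 2) -> one side gets an extra pixel
```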

fosheimdet

"we dont talk about the goose goblin" - MadTV

yoursubconscious