All Convolution Animations Are Wrong (Neural Networks)


All the neural network 2D convolution animations you've seen are wrong.

Comments

Premise 1: All convolution animations are wrong
Premise 2: This is a convolution animation

Conclusion: this is wrong

rezhaadriantanuharja

You should've started with the typical 3-channel RGB input image and animated convolutions on that; that's where most people start to get lost as to how the weights match up with the inputs when translating from the 2D mental model to 3D.

randyekrer
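
For what it's worth, a minimal PyTorch sketch of the case this comment asks for (the layer sizes are my own illustrative choices, not from the video): each filter spans all three input channels, and each filter produces one output channel.

import torch
import torch.nn as nn

# Illustrative example: a 2D conv layer applied to a 3-channel RGB image.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

x = torch.randn(1, 3, 32, 32)   # (batch, channels, height, width)
y = conv(x)

print(conv.weight.shape)  # torch.Size([8, 3, 3, 3]): each of the 8 filters spans all 3 input channels
print(y.shape)            # torch.Size([1, 8, 32, 32]): one output channel per filter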

The example is just a concept. I don't agree with this sensational title.

kuanarxiv

A major thing that feels missing to me in the animations is clear textual labeling. It's fine that you label things out loud, but on-screen labels would also make the video more accessible for folks with hearing or cognitive challenges. My crit aside, this animation is lovely, and I'm very impressed with what you've done. You've earned yourself a new subscriber :)

avidrucker

They are not wrong. They are just displaying a different case than what you are interested in. Maybe they are misplaced in the material you were looking at, but if they were animations for different things, like convolution filters in image processing, they wouldn't be wrong. Have some humility.

allNicksAlreadyTaken

All these wrong illustrations and animations have been among the many problems that make you think, "why the hell have we been doing this wrong, all the time, everywhere?" Finally, someone came along and did the obvious. Thank you!

thomasprimidis

The animation is just meant as an abstraction of the spatial convolution operation itself. A spatial CNN layer consists of spatial convolution operations across multiple input and output channels (which is what you are referring to).

peabrane
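
As a hedged sketch of that decomposition (shapes are illustrative, not from the video): one output channel of a multi-channel conv2d is just the sum of per-input-channel 2D convolutions, which can be checked numerically.

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)   # 3 input channels
w = torch.randn(4, 3, 3, 3)   # 4 filters, each spanning the 3 input channels

full = F.conv2d(x, w)         # shape (1, 4, 6, 6)

# Rebuild output channel 0 from three single-channel 2D convolutions.
rebuilt = sum(F.conv2d(x[:, c:c+1], w[0:1, c:c+1]) for c in range(3))

print(torch.allclose(full[:, 0:1], rebuilt, atol=1e-5))  # True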

Forget the animation itself (even though it's great). I just appreciate a non-moving camera. It bothers me so much when people spin the camera in a circle around a nice animation. Makes me feel like I'm on a carnival ride.

logon

Oh man, I'm so glad someone took a direct approach to this problem. When I was learning, I was so confused by all these 2D animations and explanations, and then seeing the resulting tensor shapes got me even more confused: where did the depth go, and where did it appear? Thanks for bringing this video to the world!

spider

The first animation you say is wrong shows the contribution of one filter operation, which is quite accurate. If you consider the number of input channels to be one and the number of output channels to be one, that is the right figure for the whole operation. The conv2d operation is just element-wise matrix multiplications with shifting windows. The 3D animation you made looks great but lacks that notion. That is my opinion; I'll stick with the 2D.

bediosoro
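
A minimal NumPy sketch of the single-input-channel, single-output-channel case described here, where the classic 2D picture is accurate: every output pixel is an element-wise multiply of the kernel with a shifting window, then a sum.

import numpy as np

def conv2d_single_channel(image, kernel):
    # Valid-mode 2D "convolution" (really cross-correlation): slide the
    # kernel over the image, multiply element-wise, and sum each window.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0                      # simple box blur
print(conv2d_single_channel(image, kernel).shape)   # (3, 3)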

Instead of spending 95% of the video ranting about how other animations are bad, I would have appreciated it more if you had spent that time explaining how this animation works. I don't think I learned anything from this video. How do you go from an input RGB image of size W * H * 3 to some cube of size 5 * 5 * 5 (plus padding)? You lost me at step 1.

tomo
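
On the shape question, a small hedged sketch (the 5 * 5 * 5 cube in the video is presumably just an illustration size; the numbers below are my own): the input depth of 3 is consumed by the filters, and the number of filters creates the output depth.

import torch
import torch.nn as nn

x = torch.randn(1, 3, 28, 28)   # a W x H x 3 RGB image in PyTorch's (batch, C, H, W) layout

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3, padding=1)
y = conv(x)

print(y.shape)  # torch.Size([1, 5, 28, 28]): depth 3 is consumed, depth 5 (one per filter) is created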

Well, not to speak for the existing animations/figures: I won't say they are wrong. They have some issues, but essentially they are correct. When talking about 2D convolution, we should know that the input and output are 3D, as the input is a picture and the output is also a picture/feature map.

pew_pew_pew

So in the case of a feature-map input, does a 2D conv just replicate each 2D filter along the feature dimension and multiply element-wise? In the video, are the filters really 2D, just replicated to fill the number of features, or is each "2D" filter in reality a 3D tensor that matches the feature dimension?

grjesus
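
If it helps, the second reading is the standard one: nothing is replicated. Each "filter" is a full 3D tensor whose depth matches the number of input features, with independently learned weights in every channel slice. A quick PyTorch check (sizes are illustrative):

import torch.nn as nn

conv = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3)

print(conv.weight.shape)     # torch.Size([32, 16, 3, 3]): 32 filters, each 16 x 3 x 3
print(conv.weight[0].shape)  # torch.Size([16, 3, 3]): one filter is a genuine 3D tensor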

Unfortunately, this is only half right. What about when we need to understand a 4D or 5D convolution? Humans understand 2D most intuitively, and I think that is why those 2D-based animations were made. (And 2D convolution can be extended to larger dimensions.)

And deep learning convolution is unfortunately not mathematically tidy. It is derived from the "filter" of image processing, and "filter" in turn derives from "cross-correlation" from long before.

Your animation has multiple kernels; it just depicts an argument called "channels" that is only used by neural network frameworks.

devjeonghwan
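
The cross-correlation point can be checked directly: what deep learning frameworks call "convolution" skips the kernel flip of the mathematical definition. A hedged PyTorch sketch:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)   # (batch, channels, H, W)
w = torch.randn(1, 1, 3, 3)   # (out_channels, in_channels, kH, kW)

cross_corr = F.conv2d(x, w)                  # what frameworks call "convolution"
true_conv = F.conv2d(x, w.flip((-2, -1)))    # flipping the kernel gives true convolution

# The two differ unless the kernel happens to be symmetric.
print(torch.allclose(cross_corr, true_conv))  # almost surely False for a random kernel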

Thanks for that. It was really confusing before your animation came up!

felipelourenco

Not wrong bro. They are just incomplete.

shuninc

They are not wrong. They are a simplification that helps to understand the concept. Like any simplification, they are incomplete. But not wrong. It's sad that you use clickbait titles.

alexeychernyavskiy

"a 2D convolution actually takes in a 3D tensor as input and has a 3D convolution as output", well, it depends right? If you have a single channel/grayscale image then the input is in fact a 2D tensor, and each feature outputs a 2D tensor that is joined with all others in the feature map. So if you have a grayscale image with a single feature, the animations would in fact be correct.

I think the animations are perfectly fine, as they simplify a concept to it's most basic form for easy understanding. But it is true that after you understand the basic concept, a 3D - 3D representation is also nice to understand more common and complex examples.

Disclaimer that I could be wrong as I am by no means an expert, but this is my take from my current understanding of convolutions :)

pere_gin
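
A small sketch of the degenerate case described above, where the classic 2D animation is literally accurate (shapes are my own illustration):

import torch
import torch.nn as nn

# Grayscale image, one filter: in_channels = out_channels = 1, so the weight
# is effectively a single 2D kernel and the 2D-in / 2D-out picture matches.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

x = torch.randn(1, 1, 10, 10)   # one single-channel image
y = conv(x)

print(conv.weight.shape)  # torch.Size([1, 1, 3, 3]): really just one 3x3 kernel
print(y.shape)            # torch.Size([1, 1, 8, 8]): a single 2D feature map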

Amazing. You have cleared all my doubts in a single shot.

kartikpodugu

The use of all these misleading animations is the primary cause of misconceptions about convolutional neural networks; you have finally provided a good visualization. I am happy to share this content with my colleagues.

PeppeMarino