Neural Networks From Scratch - Lec 15 - GeLU Activation Function

Building Neural Networks from scratch in Python.
This is the fifteenth video of the course "Neural Networks From Scratch". This video covers the GeLU activation function and its intuition in detail. We look at the derivative of GeLU, discuss the advantages and disadvantages of using the GeLU activation function, compare its performance against the ReLU and ELU activation functions, and finally walk through the Python implementation.
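
For reference, here is a minimal sketch of what the implementation segment covers, assuming NumPy and SciPy (the function names are illustrative, not taken from the video): the exact GeLU x * Phi(x), the widely used tanh approximation, and the derivative.

import numpy as np
from scipy.special import erf

def gelu(x):
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # Common tanh approximation of GeLU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def gelu_grad(x):
    # d/dx [x * Phi(x)] = Phi(x) + x * phi(x), where phi is the standard normal PDF.
    Phi = 0.5 * (1.0 + erf(x / np.sqrt(2.0)))
    phi = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    return Phi + x * phi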

Neural Networks From Scratch Playlist:

Activation Functions Playlist:

GeLU Activation:

Please like and subscribe to the channel for more videos. This will help me assess your interests and create more content. Thank you!

Chapters:
0:00 Introduction
0:14 Motivation
1:18 Intuition & Deriving GeLU
5:35 Definition of GeLU
6:30 Derivative of GeLU
7:04 Performance comparison
7:38 Python Implementation

#geluactivationfunction, #geluactivationfunctioninneuralnetwork, #reluactivationfunction, #activationfunctioninneuralnetwork, #vanishinggradient, #selfgatedactivationfunction, #dropout
Comments

Watch my latest videos on YOLO object detection models:

MLForNerds

Wow!! Absolutely the best explanations, please keep up the great videos!

sobuzvisual

These videos are totally underrated, please continue. Do you plan to upload more, for instance on language models?

eatalittlel

For some values of x, GELU has a negative derivative. This seems like it would impede training, decreasing a weight when it should be increased. Why, then, does GELU still perform better?

dnaphysics
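
As a rough numerical check of the question above (a sketch assuming the exact erf-based GeLU; gelu_grad is an illustrative helper, not code from the video): the derivative Phi(x) + x * phi(x) does go negative for x below roughly -0.75, but its magnitude stays small, so the occasional wrong-direction nudge is mild in practice.

import numpy as np
from scipy.special import erf

def gelu_grad(x):
    # Exact GeLU derivative: d/dx [x * Phi(x)] = Phi(x) + x * phi(x).
    Phi = 0.5 * (1.0 + erf(x / np.sqrt(2.0)))
    phi = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    return Phi + x * phi

xs = np.array([-3.0, -2.0, -1.0, -0.5, 0.0, 1.0])
print(np.round(gelu_grad(xs), 4))
# approx: [-0.0119 -0.0852 -0.0833  0.1325  0.5     1.0833]
# The gradient dips only slightly below zero for x < ~-0.75, so it gently
# shrinks strongly negative pre-activations rather than destabilizing training.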