Deep Learning (CS7015): Lec 9.4 Better initialization strategies

Comments

15:24 The variance of a sum is the sum of the variances only if all the covariances are 0; one such case is when all the random variables are independent.

mainakghosh
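For reference, the standard probability fact the comment is pointing at, spelled out in full (not stated explicitly in the video):

```latex
\operatorname{Var}\!\Big(\sum_{i=1}^{n} X_i\Big)
  = \sum_{i=1}^{n} \operatorname{Var}(X_i)
  \;+\; 2 \sum_{1 \le i < j \le n} \operatorname{Cov}(X_i, X_j)
```

The cross terms vanish, and the variance of the sum equals the sum of the variances, exactly when all pairwise covariances are zero, e.g. when the X_i are independent.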

Why did sir not consider the activation function after the first layer in the proof? After computing S11 there will also be a non-linearity g(S11), which is what feeds into the second layer, not S11 directly. We do the proof so that the activations have the right variance, so ignoring the activation in the proof seems too superficial.

rajatkumarsingh
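One possible answer (my reading of the standard Xavier/Glorot derivation, not confirmed in the video): tanh is treated as approximately the identity near 0, so g(S11) ≈ S11 and Var(g(S11)) ≈ Var(S11) when the pre-activations are small. A minimal numpy sketch checking this numerically; the distribution and scale are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-activations with small variance, as expected right after a
# careful initialization (the scale 0.1 is chosen purely for illustration).
s = rng.normal(loc=0.0, scale=0.1, size=1_000_000)

# Near 0, tanh(s) ~= s, so the activation barely changes the variance.
print(np.var(s))           # ~0.0100
print(np.var(np.tanh(s)))  # ~0.0099, nearly unchanged
```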

I still didn't get why we are interested in the variance at each neuron.

luckysunda
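For anyone stuck on the same question, a possible intuition (my gloss, not the lecturer's words): if the variance of the activations shrinks or grows at every layer, then forward signals and backward gradients vanish or explode with depth, which is exactly what a good initialization tries to prevent. A minimal sketch under assumed settings (width 512, 20 tanh layers, two illustrative weight scales):

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 512, 20
x = rng.normal(size=n)

# Compare a naive small scale against a Xavier-style scale sqrt(1/n).
for scale in (0.01, np.sqrt(1.0 / n)):
    h = x
    for _ in range(depth):
        W = rng.normal(scale=scale, size=(n, n))
        h = np.tanh(W @ h)
    print(f"scale={scale:.4f}  Var(h) after {depth} layers: {np.var(h):.2e}")
```

With the naive scale the activation variance collapses toward 0 layer by layer; with the sqrt(1/n) scale it stays roughly stable, which is why the derivation tracks the per-neuron variance.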

How can we calculate the expectation of W₁ᵢ? If we fix i (any value from 1 to n), then W₁ᵢ is a constant, and the expectation of a constant is the constant itself.

suhaneshivam
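A point of context that may resolve this (my reading of the derivation, not a quote from the video): at initialization each weight W₁ᵢ is a random variable drawn from the initialization distribution, and the expectation is over that distribution, not over the index i. For a zero-mean initializer:

```latex
W_{1i} \sim \mathcal{N}(0, \sigma^2)
  \quad \Rightarrow \quad
  \mathbb{E}[W_{1i}] = 0
```

W₁ᵢ only becomes a constant after a particular draw; the expectation in the proof is taken before the draw.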

Why is the gradient of tanh or sigmoid at x = 0 said to be equal to 0? The value of the gradient at x = 0 for sigmoid is 0.25.

binod
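For reference, the standard derivative values at the origin, which back up the comment that neither gradient is 0 there:

```latex
\sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr)
  \;\Rightarrow\; \sigma'(0) = \tfrac{1}{2} \cdot \tfrac{1}{2} = 0.25,
\qquad
\tanh'(x) = 1 - \tanh^2(x)
  \;\Rightarrow\; \tanh'(0) = 1
```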