cs294a Sparse Autoencoder Lecture Part 2

Stanford CS294A Sparse Autoencoder and Unsupervised Feature Learning Lecture Videos
Comments

He's wearing such a nice pullover <3

davidpomerenke

This is from 2011!! Nearly a decade old. No wonder he looks much younger!

Also, creating a sparse autoencoder can be achieved easily using other methods as well!

amortalbeing

Missing here is the loss function, which is the regular reconstruction loss plus a KL divergence term; also missing are the derivatives of the loss function.

mantrava
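For readers wondering what the loss with the KL divergence penalty looks like, here is a minimal NumPy sketch of a sparse autoencoder objective: reconstruction error plus a KL penalty that pushes each hidden unit's mean activation toward a target sparsity. The layer sizes and the `rho`/`beta` values are illustrative defaults, not taken from the lecture.

```python
import numpy as np

def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def sparse_autoencoder_loss(X, W1, b1, W2, b2, rho=0.05, beta=3.0):
    """Mean squared reconstruction error plus a KL sparsity penalty
    on the average activation of each hidden unit."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    H = sigmoid(X @ W1 + b1)          # hidden activations, shape (m, k)
    X_hat = sigmoid(H @ W2 + b2)      # reconstruction of the input
    rho_hat = H.mean(axis=0)          # mean activation per hidden unit
    recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))
    return recon + beta * np.sum(kl_divergence(rho, rho_hat))

# demo on random data (shapes are arbitrary)
rng = np.random.default_rng(0)
X = rng.random((10, 4))
W1 = rng.standard_normal((4, 3)) * 0.1
W2 = rng.standard_normal((3, 4)) * 0.1
loss = sparse_autoencoder_loss(X, W1, np.zeros(3), W2, np.zeros(4))
```

Both terms are non-negative, so the loss is bounded below by zero; the penalty vanishes exactly when every hidden unit's mean activation equals `rho`.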

It seems that the lecture notes on sparse autoencoders differ from the video. The notes use a KL divergence penalty, but that is not mentioned here. Does the update scheme presented in the video do the same thing? I assume the KL regularizer would affect all the weights, while the scheme shown in the video only affects the bias.

juliusctw

What is the intuition behind using the sparsity constraint as opposed to simply using fewer neurons?

clapdrix

Hello,

as I saw your graph, I was a little concerned about the perfection of your representation and the interpretability of the hidden layer.

Did you see the original paper by Olshausen and Field on sparse coding, claiming that an overcomplete basis set really leads to derivable lower-level feature maps which are consecutive in terms of their distribution propagation?

In any case, I like the idea of compressed sensing for information filtering in higher-level analysis tasks.
I think it is possible to combine the outcome, using residual learning, with further layers built on the learned features (classification, convolution, or noise cancelling, for instance).
But correct me if I am wrong.

-M

__MGR___

Thanks for the great post. I have a question about the parameter updates: Eq. 1 and Eq. 2 for updating rho and b do not refer to the derivative terms, so why do we still use SGD during training?

anynamecanbeuse
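On the question of the missing derivative terms: the KL penalty does have a simple closed-form derivative with respect to the mean activation, which is what gradient-based training uses. A small sketch with a finite-difference check (the `rho`/`rho_hat` values are arbitrary test points):

```python
import numpy as np

def kl(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def kl_grad(rho, rho_hat):
    """Analytic derivative of the penalty w.r.t. the mean activation rho_hat."""
    return -rho / rho_hat + (1 - rho) / (1 - rho_hat)

# central finite-difference check of the analytic derivative
rho, rho_hat, eps = 0.05, 0.2, 1e-6
numeric = (kl(rho, rho_hat + eps) - kl(rho, rho_hat - eps)) / (2 * eps)
```

During backpropagation this term is simply added to each hidden unit's error signal, scaled by the penalty weight.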

Hello to my fellow video watchers! I have a question: what is X_j ~ N(0, sigma), and why would it be difficult to learn a compressed representation from it?

akshathvarugeese
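On the X_j ~ N(0, sigma) question above: when the inputs are independent Gaussians, the dimensions share no structure, so every direction carries roughly equal variance and there is nothing for a compressed code to exploit. A quick NumPy check of the covariance spectrum (sample size and dimension are arbitrary):

```python
import numpy as np

# i.i.d. standard Gaussian inputs: x_j ~ N(0, 1), independent across j
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(100_000, 8))

# sample covariance and its eigenvalues; with no shared structure,
# all eigenvalues cluster near sigma^2 = 1, so no subspace can be
# dropped without losing information
cov = np.cov(X, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)
```

Contrast this with natural images, where the covariance spectrum decays sharply and a small number of directions capture most of the variance.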

At minute 32, how do you calculate the input that maximizes the activation of hidden unit i?

amanma
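On the maximally activating input question above: under a unit-norm constraint on the input, the input that maximizes hidden unit i's activation is that unit's weight vector, normalized, which is how the feature visualizations in Ng's CS294A notes are derived. A minimal sketch (the example weights are made up):

```python
import numpy as np

def max_activating_input(W, i):
    """Unit-norm input x maximizing hidden unit i's pre-activation w_i . x
    (and hence its sigmoid activation). By Cauchy-Schwarz, the maximizer
    over ||x|| <= 1 is x = w_i / ||w_i||."""
    w = np.asarray(W, dtype=float)[i]
    return w / np.linalg.norm(w)

# example: for weight row (3, 4) the maximizing input is (0.6, 0.8)
x_star = max_activating_input([[3.0, 4.0], [1.0, 0.0]], 0)
```

Plotting these normalized weight vectors as images is what produces the edge-detector-like visualizations of the learned features.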