From Autoencoders to Variational Autoencoders: Improving the Loss Function

Autoencoders have a number of limitations for generative tasks. That’s why they need a power-up to convert them into Variational Autoencoders. In this video, I explain the second step to transform a vanilla autoencoder into a VAE. Specifically, I discuss how VAEs add a regularization term to their loss function, implemented through the Kullback-Leibler Divergence.
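To make the idea concrete, here is a minimal sketch of the combined loss described above. The function name, shapes, and the weighting constant are assumptions for illustration, not the exact code used in the series:

```python
# Minimal sketch of a VAE loss: weighted reconstruction error plus the
# closed-form KL divergence between N(mu, sigma^2) and the prior N(0, I).
import numpy as np

def vae_loss(x, x_hat, mu, log_var, reconstruction_weight=1000.0):
    # x, x_hat:    (batch, features) inputs and reconstructions
    # mu, log_var: (batch, latent_dim) encoder outputs
    # reconstruction_weight: hypothetical factor balancing the two terms

    # Reconstruction term: mean squared error per sample
    reconstruction_loss = np.mean((x - x_hat) ** 2, axis=1)

    # KL term per sample: -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl_loss = -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=1)

    # Weighted sum, averaged over the batch
    return np.mean(reconstruction_weight * reconstruction_loss + kl_loss)
```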

===============================

Slide deck:

Join The Sound Of AI Slack community:

===============================

Interested in hiring me as a consultant/freelancer?

Follow Valerio on Facebook:

Connect with Valerio on Linkedin:

Follow Valerio on Twitter:

===============================

Content
0:00 Intro
0:44 Autoencoder loss
1:42 VAE loss
3:08 Kullback-Leibler Divergence
8:02 Weighting the loss function
9:45 What's next?
Comments

Another great video, Valerio! Looking forward to the next one in the series.

pcasabianca

Looking forward to the next video, very nice work.

mostafahasanian

I was looking forward to the next video on 'Implementing VAE'. It has been two weeks!

avidreader

Valerio, when is the next part of the series coming? Thank you for this wonderful series <3

TheMagicmagic

Great work! It would be perfect and simpler if all your code were done without classes, in a Jupyter notebook.

loicbaconnier

I have seen in some blogs that when they take the combined loss, they take the mean of the two: K.mean(recon_loss + kl_loss). What is your opinion?

techwithnavinx
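A note on the question above: by linearity of the mean, averaging the already-combined loss is equivalent to averaging each term separately, so the two blog conventions coincide. A small sketch, assuming tf.keras as the K.mean in the comment suggests:

```python
from tensorflow.keras import backend as K

def combined_then_mean(recon_loss, kl_loss):
    # recon_loss, kl_loss: per-sample tensors of shape (batch,)
    return K.mean(recon_loss + kl_loss)

def mean_each_term(recon_loss, kl_loss):
    # Equivalent, since mean(a + b) == mean(a) + mean(b)
    return K.mean(recon_loss) + K.mean(kl_loss)
```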

Can anyone tell if the KL is forward or reverse?

adityamehra
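On the question above: in the usual variational-inference terminology, the VAE regularizer is D_KL(q(z|x) || p(z)), with the approximate posterior q first, which is commonly called the reverse (exclusive) KL; KL(p || q) would be the forward direction. A tiny sketch showing that the direction matters (the example numbers are arbitrary):

```python
# KL divergence between two 1-D Gaussians is asymmetric, so the direction matters.
import numpy as np

def kl_gaussians(mu1, sigma1, mu2, sigma2):
    # Closed-form KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) )
    return (np.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

# q = N(1, 0.5^2) as a hypothetical encoder output, p = N(0, 1) as the prior
print(kl_gaussians(1.0, 0.5, 0.0, 1.0))  # KL(q || p): the direction used in the VAE loss
print(kl_gaussians(0.0, 1.0, 1.0, 0.5))  # KL(p || q): a different value
```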

Wonderful... wondering if I could get a transcript. God bless.

asheeshmathur

Please create a course on speech-to-text and text-to-speech.
And thanks for this wonderful course!

satish

I'm just wondering about how you actually apply these algorithms to sound. I guess there is a lot of material showing how to generate images, but the problem with sound is actually inputting a bunch of spectrograms into the VAE, then sampling the latent space to generate a new spectrogram, and actually converting it back to audio. I can't find any resources that explain this audio generation process in a simple way. I would love to see a video where you cover specifically this audio process.

GuilhermeCosta-nvzm
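On the audio question above, one common recipe — an assumption here, not necessarily what the upcoming videos use — is to train the VAE on log-magnitude spectrograms and then invert a generated spectrogram back to a waveform with the Griffin-Lim algorithm, roughly like this:

```python
# Sketch: invert a generated log-magnitude (dB) spectrogram back to audio.
import librosa
import soundfile as sf

HOP_LENGTH = 256       # assumed STFT hop size used when the spectrograms were computed
SAMPLE_RATE = 22050    # assumed sample rate of the training audio

def spectrogram_to_audio(log_spectrogram_db, output_path="generated.wav"):
    # Undo the dB scaling to recover a linear magnitude spectrogram
    magnitude = librosa.db_to_amplitude(log_spectrogram_db)
    # Griffin-Lim estimates the missing phase and inverts the STFT
    audio = librosa.griffinlim(magnitude, hop_length=HOP_LENGTH)
    sf.write(output_path, audio, SAMPLE_RATE)
    return audio
```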

Great video, Valerio. I have a question: the reconstruction loss in an autoencoder is compiled at the end of the encoder-decoder structure, i.e. on the autoencoder as a whole. Here the KL divergence seems to affect the encoder only. So how do we compile the KL loss in the implementation?


Looking forward to the rest of the videos.

Saitomar
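Regarding the question above about where the KL term lives: one possible pattern — a sketch, not necessarily the implementation used later in the series — is to attach the KL term to the whole model with add_loss(), since it only depends on the encoder outputs, and pass the reconstruction loss to compile(); Keras then sums the two during training:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 2

# Encoder: produces the mean and log-variance of q(z|x)
inputs = keras.Input(shape=(784,))
h = layers.Dense(256, activation="relu")(inputs)
mu = layers.Dense(latent_dim)(h)
log_var = layers.Dense(latent_dim)(h)

# Reparameterization trick: z = mu + sigma * epsilon
def sample(args):
    mu, log_var = args
    epsilon = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * epsilon

z = layers.Lambda(sample)([mu, log_var])

# Decoder: reconstructs the input from z
outputs = layers.Dense(784, activation="sigmoid")(layers.Dense(256, activation="relu")(z))

vae = keras.Model(inputs, outputs)

# KL term depends only on the encoder outputs, so it is added to the model directly
kl_loss = -0.5 * tf.reduce_mean(
    tf.reduce_sum(1 + log_var - tf.square(mu) - tf.exp(log_var), axis=1))
vae.add_loss(kl_loss)

# The reconstruction term goes through compile(); Keras adds it to the KL term
vae.compile(optimizer="adam", loss="mse")
```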

Are you going to do something in Python?

Jorge-wftg

Sir, please create a course on text-to-speech synthesis.

saigeeta