The Reparameterization Trick

This video covers what the Reparameterization trick is and when we use it. It also explains the trick from a mathematical/statistical perspective.

CHAPTERS:
00:00 Intro
00:28 What/Why?
08:17 Math
Comments

WOW! THANK U. FINALLY MAKING IT EASY TO UNDERSTAND. WATCHED SO MANY VIDEOS ON VAE AND THEY JUST BRIEFLY GO OVER THE EQUATION WITHOUT EXPLAINING

sx.

Sometimes understanding the complexity makes a concept clearer. This was one such example. Thanks a lot.

advayargade

Thanks, this is a good explanation of the most obscure point of VAEs

slimanelarabi

Thank you for your effort, it all tied together nicely at the end of the video. This was clear and useful.

MonkkSoori

Very nice video, it helped me a lot. Finally someone explains the math without leaving out the essential parts.

PaulF-lj

This was the analogy I got from ChatGPT to understand the problem 😅. Hope it's useful to someone:


"Certainly, let's use an analogy involving shooting a football and the size of a goalpost to explain the reparameterization trick:

Imagine you're a football player trying to score a goal by shooting the ball into a goalpost. However, the goalpost is not of a fixed size; it varies based on certain parameters that you can adjust. Your goal is to optimize your shooting technique to score as many goals as possible.

Now, let's draw parallels between this analogy and the reparameterization trick:

1. **Goalpost Variability (Randomness):** The size of the goalpost represents the variability introduced by randomness in the shooting process. When the goalpost is larger, it's more challenging to score, and when it's smaller, it's easier.

2. **Shooting Technique (Model Parameters):** Your shooting technique corresponds to the parameters of a probabilistic model (such as `mean_p` and `std_p` in a VAE). These parameters affect how well you can aim and shoot the ball.

3. **Optimization:** Your goal is to optimize your shooting technique to score consistently. However, if the goalpost's size (randomness) changes unpredictably every time you shoot, it becomes difficult to understand how your adjustments to the shooting technique (model parameters) are affecting your chances of scoring.

4. **Reparameterization Trick:** To make the optimization process more effective, you introduce a fixed-size reference goalpost (a standard normal distribution) that represents a known level of variability. Every time you shoot, you still adjust your shooting technique (model parameters), but you compare your shots to the reference goalpost.

5. **Deterministic Transformation:** This reference goalpost allows you to compare and adjust your shooting technique more consistently. You're still accounting for variability, but it's structured and controlled. Your technique adjustments are now more meaningful because they're not tangled up with the unpredictable variability of the changing goalpost.

In this analogy, the reparameterization trick corresponds to using a reference goalpost with a known size to stabilize the optimization process. This way, your focus on optimizing your shooting technique (model parameters) remains more effective, as you're not constantly grappling with unpredictable changes in the goalpost's size (randomness)."
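
In code, a minimal sketch of the same idea (assuming PyTorch; the values of `mean_p`/`std_p` and the loss are hypothetical stand-ins for encoder outputs and a decoder loss):

```python
import torch

# Hypothetical encoder outputs for one sample (the "shooting technique" parameters).
mean_p = torch.tensor([0.5, -1.0], requires_grad=True)
std_p = torch.tensor([1.2, 0.8], requires_grad=True)

# Sampling z directly from N(mean_p, std_p^2) would be a stochastic node,
# so backpropagation could not reach mean_p and std_p through it.

# Reparameterization: draw eps from a fixed N(0, I) (the "reference goalpost"),
# then build z via a deterministic, differentiable transformation.
eps = torch.randn_like(std_p)
z = mean_p + std_p * eps  # z ~ N(mean_p, std_p^2), differentiable in mean_p, std_p

loss = (z ** 2).sum()  # stand-in for a decoder/reconstruction loss f_theta(z)
loss.backward()        # gradients now flow back to mean_p and std_p
print(mean_p.grad, std_p.grad)
```

The key line is `z = mean_p + std_p * eps`: all the randomness lives in `eps`, so gradients flow through the deterministic transformation to the model parameters.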

mohammedyasin

This is a life-changing video, thank you very much 😊 🙏🏻

abdelrahmanahmad

Thank you for this video, this has helped me a lot

ettahiriimane

Thank you for this video, this has helped a lot in my own research on the topic

chasekosborne

Thank you, I liked your intuition, amazing effort.

amirnasser

Thank you so much! Please continue with more videos on ML.

salahaldeen

Beautifully said. Love how you laid out things, both the architecture and math. Thanks a million.

Gus-AI-World

Your explanation is brilliant! We need more things like this. Thank you!

HuuPhucTran-jtrk

16:27 It's unclear to me (in the context of the gradient operator and the expectation) why f_theta(z) can't be differentiated, and why replacing z with g_theta(eps, x) lets us move the gradient operator inside the expectation and "make things differentiable" (from a math point of view).

P.S. In practice we train with an MSE term plus the KL divergence between two Gaussians, q(z|x) and p(z), where p_mean = 0 and p_sigma = 1, and this allows us to "train" the mean and variance vectors in a VAE.
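
If it helps, that KL term between the diagonal Gaussian q(z|x) = N(mu, sigma^2) and p(z) = N(0, 1) has a well-known closed form:

$$\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^{2})\,\|\,\mathcal{N}(0,1)\big)=\frac{1}{2}\sum_{i}\left(\mu_{i}^{2}+\sigma_{i}^{2}-\log\sigma_{i}^{2}-1\right)$$

which is already differentiable in mu and sigma, so only the sampling step needs the reparameterization trick.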

tempdeltavalue

I have a small question about the video that slightly bothers me. What does this normal distribution we are sampling from consist of? If it's the distribution of latent vectors, how do we collect them during training?

КириллКлимушин

Your voice is literally from the "Giorgio by Moroder" song

vkkn

Thanks for the vid 👋
I actually lost the thread in the middle of the math explanation, but that's probably because I'm not that familiar with VAEs and don't know some of the skipped tricks 😁
I guess it's a bit clearer for people in the field :)

my_master

It is cool, although I don't really understand the second half. 😅

wilsonlwtan

Thank you so much for your video! It definitely saved my life :)

jinyunghong

The derivative of the expectation is the expectation of the derivative? That's surprising to my feeble mind.
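
As far as I can tell, it only works after reparameterization: once the expectation is over a fixed eps ~ N(0, I) that doesn't depend on theta, the gradient can (under the usual regularity conditions) be pulled inside:

$$\nabla_{\theta}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\big[f_{\theta}(g_{\theta}(\epsilon,x))\big]=\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\big[\nabla_{\theta}\,f_{\theta}(g_{\theta}(\epsilon,x))\big]$$

With the original $\mathbb{E}_{z\sim q_{\theta}(z|x)}$ the swap isn't allowed directly, because the sampling distribution itself depends on $\theta$.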

dennisestenson