Expectation Maximization Algorithm | Intuition & General Derivation
Maximum Likelihood is a great first approach for fitting the parameters of a model when all you have is data. However, it breaks down once your model contains latent random variables, i.e., nodes for which you do not observe any data. A remedy is to work with the marginal likelihood instead of the full likelihood, but this approach leads to some difficulties that we have to overcome.
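For reference, the marginal likelihood sums (or integrates) the latent variables out of the joint; in generic notation (X for observed data, Z for latent variables, θ for parameters; these symbols are my own placeholders, not necessarily the video's notation):

```latex
% Marginal likelihood: latent variables Z are summed (or integrated) out of the joint
p(X \mid \theta) = \sum_{Z} p(X, Z \mid \theta)

% Taking the log yields the marginal log-likelihood that the derivation works with
\log p(X \mid \theta) = \log \sum_{Z} p(X, Z \mid \theta)
```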
In this video, I show how to derive a lower bound on the marginal log-likelihood, including all the necessary tricks like importance sampling and Jensen's inequality. We then end up with a chicken-and-egg problem: we need the distribution's parameters to compute the expectation, but we also need that expectation to update the parameters. Consequently, we have to resort to an iterative algorithm, which consists of the E-Step (Expectation) and the M-Step (Maximization).
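Compactly, the derivation proceeds roughly like this (q denotes the auxiliary distribution introduced by the importance-sampling trick; the notation is my own shorthand):

```latex
% Importance-sampling trick, then Jensen's inequality (log is concave) gives a lower bound
\log p(X \mid \theta)
  = \log \sum_{Z} q(Z) \, \frac{p(X, Z \mid \theta)}{q(Z)}
  \;\geq\; \sum_{Z} q(Z) \, \log \frac{p(X, Z \mid \theta)}{q(Z)}

% E-step: choose q as the posterior over the latent variables under the old parameters
q(Z) = p(Z \mid X, \theta_{\text{old}})

% M-step: maximize the expected complete-data log-likelihood w.r.t. the new parameters
\theta_{\text{new}} = \arg\max_{\theta} \sum_{Z} p(Z \mid X, \theta_{\text{old}}) \, \log p(X, Z \mid \theta)
```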
An important remark: the derivations I present here are just a framework. For each application scenario, for instance Gaussian Mixture Models, the maximization has to be carried out anew to arrive at simple update equations (see the sketch below).
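To make that concrete, here is a minimal Python sketch of what the framework yields for a 1-D Gaussian Mixture Model (my own illustrative code, not the implementation from the video): the responsibilities form the E-step, the closed-form updates form the M-step.

```python
import numpy as np

def em_gmm_1d(x, n_components=2, n_iters=100, seed=0):
    """Minimal EM for a 1-D Gaussian mixture (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Initialization: uniform weights, random data points as means, global variance
    weights = np.full(n_components, 1.0 / n_components)
    means = rng.choice(x, size=n_components, replace=False)
    variances = np.full(n_components, np.var(x))

    for _ in range(n_iters):
        # E-step: responsibilities resp[i, k] = p(z_i = k | x_i, old parameters)
        log_prob = (
            np.log(weights)
            - 0.5 * np.log(2.0 * np.pi * variances)
            - 0.5 * (x[:, None] - means) ** 2 / variances
        )
        log_prob -= log_prob.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: closed-form updates maximizing the expected complete-data log-likelihood
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk

    return weights, means, variances

# Usage: recover two well-separated Gaussian components from synthetic data
data = np.concatenate([np.random.normal(-2.0, 1.0, 500),
                       np.random.normal(3.0, 0.5, 500)])
print(em_gmm_1d(data))
```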
-------
Info on why the Expectation Maximization algorithm does not work for the Bernoulli-Bernoulli model:
[TODO] I will work on a video on this, stay tuned ;)
-------
Timestamps:
00:00 Introduction
00:48 Latent means missing data
02:15 How to define the Likelihood?
02:55 Marginal Likelihood
05:05 Disclaimer: It will not work
05:48 Marginal Likelihood (cont.)
06:15 Marginal Log-Likelihood
08:11 Importance Sampling Trick
11:31 Jensen's Inequality
13:03 A lower bound (error, see comments below)
15:23 The Posterior over the latent variables
16:20 A lower bound (cont.) (error, see comments below)
17:56 The Chicken-Egg Problem
20:18 Old and new parameters
21:55 The Maximization Procedure
22:56 A simplified upper bound
25:04 Responsibilities
25:46 The EM Algorithm
28:28 An MLE under missing data
29:07 Outro