Neural networks [5.4] : Restricted Boltzmann machine - contrastive divergence

Comments

Can we say, intuitively, that the model learns from the difference between the given sample and the general pattern it has learned so far?

revolutionarydefeatism

Hi Hugo, I can't see why splitting the partial derivative into a positive phase and a negative phase is an obvious step :( Where does this trick come from, and why? Thanks!

fengji

I am confused by the equation at 2:28.
Could you explain why this partial derivative is composed of a positive phase minus a negative phase?

ayakoyamaguchi
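
A sketch of where the split comes from, assuming the standard energy-based formulation used in the lecture, p(x) = \sum_h e^{-E(x,h)} / Z with Z = \sum_{x,h} e^{-E(x,h)}:

\frac{\partial\,(-\log p(x))}{\partial\theta}
  = \frac{\partial}{\partial\theta}\Big[ -\log \sum_h e^{-E(x,h)} + \log Z \Big]
  = \underbrace{\mathbb{E}_{h\mid x}\!\Big[\frac{\partial E(x,h)}{\partial\theta}\Big]}_{\text{positive phase}}
    \;-\;
    \underbrace{\mathbb{E}_{x,h}\!\Big[\frac{\partial E(x,h)}{\partial\theta}\Big]}_{\text{negative phase}}

The positive phase depends on the observed example x; the negative phase is exactly the gradient of log Z, i.e. it comes from the partition function, which is why it is the troublesome term.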

Does the negative phase refer to the partition function? Or in which video is that partition function approximated?

duvansepulveda

If yes, it's weird, because the loss function for training a neural network requires class labels, while we are doing unsupervised learning here.
Could you please help explain?

SonNguyen
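
For context on the loss being discussed: the objective used in this lecture (as later comments also mention) is the average negative log-likelihood of the inputs themselves, so no class labels appear anywhere,

\frac{1}{T} \sum_{t=1}^{T} -\log p\big(x^{(t)}\big)

taken over the T unlabeled training examples x^{(t)}; it is a fully unsupervised criterion.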

Hugo, thanks for the slides. I have a question about why the 2nd term is not tractable. Some papers say it runs over 2^m states. Could you explain in a bit more detail? Thank you very much.

teacher
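
A sketch of why the second (negative-phase) term blows up, assuming n binary visible units and m binary hidden units:

\mathbb{E}_{x,h}\!\Big[\frac{\partial E(x,h)}{\partial\theta}\Big]
  = \sum_{x \in \{0,1\}^n} \sum_{h \in \{0,1\}^m} p(x,h)\,\frac{\partial E(x,h)}{\partial\theta},
\qquad
Z = \sum_{x \in \{0,1\}^n} \sum_{h \in \{0,1\}^m} e^{-E(x,h)}

The sum over h can be carried out analytically for an RBM (it factorizes given x), but the remaining sum over the 2^n visible configurations, and the partition function Z inside p(x,h), cannot, so the expectation is intractable for any realistic number of units; the 2^m mentioned in papers is this kind of exponential count of configurations.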

Hello Hugo,
Thank you very much for your excellent lecture series. I enjoy your lectures a lot.

Regarding this RBM lecture, I have some questions:
1. I found that many papers derive contrastive divergence from the "KL distance". In your lecture, you started with the average log-likelihood; is there any explanation for this? It is a bit confusing to me.
2. Is there any further reading to understand where the "positive phase" and "negative phase" come from?

Best,

ThuongNgC
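
On question 1, a short worked link between the two starting points (a standard identity, not specific to the lecture; \hat{p} denotes the empirical distribution of the training set):

\frac{1}{T}\sum_{t=1}^{T} -\log p_{\theta}\big(x^{(t)}\big)
  = -\,\mathbb{E}_{x\sim\hat{p}}\big[\log p_{\theta}(x)\big]
  = \mathrm{KL}\big(\hat{p}\,\Vert\,p_{\theta}\big) + H(\hat{p})

Since the entropy H(\hat{p}) does not depend on the parameters \theta, minimizing the average negative log-likelihood and minimizing the KL divergence from the data distribution to the model are equivalent, which is why different papers can start from either quantity.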

Around 2:50 the notation can be confusing, since you are using E for the expectation and E for the energy function. I can see the energy E appears in italics, but still... just for clarity, the two could use different symbols.

dgg

Very good, congrats!

I've read a lot about RBMs and CD, and only now do I think I have understood them.

I have a question: are there other, more modern ways to train an RBM?

Regards from Brazil.

andtenorio

Hey, Hugo! I'm quite confused about PCD. If we use the previous iteration's Gibbs sampling result instead of the current training sample, does that mean all we need is just one sample, and the other samples are not used? I'm confused...

youlihanshu
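
A minimal sketch of the difference between CD-k and PCD, assuming binary units, mini-batches stored as rows of x, and the usual energy E(x,h) = -h^T W x - c^T x - b^T h; all names below (sample_h_given_x, negative_sample_pcd, etc.) are made up for illustration, not taken from the lecture:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sample_h_given_x(x, W, b):
    # p(h_j = 1 | x) = sigmoid(b_j + W_j . x); returns a binary sample and the probabilities
    p = sigmoid(b + x @ W.T)
    return (rng.random(p.shape) < p).astype(float), p

def sample_x_given_h(h, W, c):
    # p(x_k = 1 | h) = sigmoid(c_k + h . W_:,k)
    p = sigmoid(c + h @ W)
    return (rng.random(p.shape) < p).astype(float), p

def negative_sample_cd(x, W, b, c, k=1):
    # CD-k: the Gibbs chain for the negative phase starts at the current training batch x
    x_tilde = x
    for _ in range(k):
        h, _ = sample_h_given_x(x_tilde, W, b)
        x_tilde, _ = sample_x_given_h(h, W, c)
    return x_tilde

def negative_sample_pcd(chain, W, b, c, k=1):
    # PCD: the Gibbs chain starts where it stopped at the previous update
    # (the "persistent" chain), NOT at the current training batch
    x_tilde = chain
    for _ in range(k):
        h, _ = sample_h_given_x(x_tilde, W, b)
        x_tilde, _ = sample_x_given_h(h, W, c)
    return x_tilde  # this becomes the chain state for the next update

def update(x, x_tilde, W, b, c, lr=0.01):
    # The training batch x is still used here, in the positive phase;
    # PCD only changes how x_tilde is obtained.
    _, ph_pos = sample_h_given_x(x, W, b)        # h_hat(x)
    _, ph_neg = sample_h_given_x(x_tilde, W, b)  # h_hat(x_tilde)
    W += lr * (ph_pos.T @ x - ph_neg.T @ x_tilde) / x.shape[0]
    b += lr * (ph_pos - ph_neg).mean(axis=0)
    c += lr * (x - x_tilde).mean(axis=0)

So with PCD every training example still enters the update through the positive phase; what changes is only the initialization of the negative-phase chain, and in practice one keeps a whole batch of persistent chains ("fantasy particles") rather than a single sample.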

Hello Hugo,
I did not understand how you removed the expectations (E_{x} and E_{x, h} disappeared) starting from 10:36.
If you could give a bit more explanation, that would be great.
Thanks anyway.

mahmoudalbardan
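
One way to see what happens to the two expectations (using the usual RBM energy, so that the conditional over h factorizes): the positive-phase expectation E_{h|x}[.] can be computed exactly in closed form, while the negative-phase one is replaced by a point estimate at a sample x_tilde obtained by Gibbs sampling,

\mathbb{E}_{x,h}\!\Big[\frac{\partial E(x,h)}{\partial W}\Big]
  = \mathbb{E}_{x}\Big[\,\mathbb{E}_{h\mid x}\Big[\frac{\partial E(x,h)}{\partial W}\Big]\Big]
  \;\approx\; \mathbb{E}_{h\mid \tilde{x}}\Big[\frac{\partial E(\tilde{x},h)}{\partial W}\Big]
  = -\,\hat{h}(\tilde{x})\,\tilde{x}^{\top},
\qquad \hat{h}(\tilde{x})_j = p(h_j = 1 \mid \tilde{x})

so the expectations do not really disappear: the outer one over x is approximated by a single Gibbs sample \tilde{x}, and the inner one over h is evaluated analytically.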

Thank you for the video, it really helped me understand the details of CD a lot better

graufx

@5:00 So, basically, E_{x,h}[.] = E_x E_{h|x}[.] ≈ E_{h|xtilde}[.], where xtilde is sampled from p(x) using Gibbs sampling.

kiuhnmmnhuik

Hi, Hugo. First, thank you for this nice video. I have a question about the Gibbs sampling procedure. At about 10:19, you said the sampling will terminate at step k. Is this k the dimension of the data? Personally, I don't think it should be, because there seems to be no connection between the data dimension and the number of sampling steps, but if it is, please tell me why. Thank you.

chenwang

Hey, why is the average NLL used, i.e. why take the log of the function? What is the reason behind it? If you can recommend a paper or a previous video, that would be cool. Thanks.

michaelosinowo

Hi Hugo,
first of all, a big thanks for the tutorial videos. I'm trying to express my understanding of the first of the 3 main ideas of CD. You said it is to replace the expectation with a point estimate at x_tilde. My question is: am I right if I justify the 'point estimate' of an expectation as follows?
You want to compute the expectation of a function and you don't have a means to do it exactly for some reason. But you do have a way to find the most probable (expected) value of x (the random variable with respect to which the expectation is taken), so you assume that the whole probability mass is concentrated on that particular value of x (denoted x_tilde). You then compute the value of the function there and multiply it by the probability of that x_tilde (which is 1) to get an estimate of the expectation. So the expectation is basically the value of the function at the most probable (expected) value of the random variable.
Hope I'm clear with my question. And thanks again for this wonderful series of videos.

pathbholapathik
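
One way to make the "point estimate" precise (a standard single-sample Monte Carlo argument, not specific to the lecture): the expectation is approximated by the value of the function at one configuration drawn from the distribution,

\mathbb{E}_{x\sim p}\big[f(x)\big] = \sum_{x} p(x)\,f(x) \;\approx\; f(\tilde{x}), \qquad \tilde{x}\sim p(x)

which is unbiased on average over draws of \tilde{x}. Concentrating all the mass on the single most probable x would instead be a mode approximation; in CD, \tilde{x} comes from a few steps of Gibbs sampling started at the training example, so it is (approximately) a sample rather than the mode.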

Very clear and visual, thank you so much!

anirudhsingh

Thank you so much for the excellent explanation

snigdhapurohit

Dear Dr. Larochelle, thank you so much for your lectures and for making them public.

shamimabanu

Hello Hugo,
I would like to ask a question about the wake-sleep algorithm and contrastive divergence. They seem similar to me in their basic idea. Are there any differences between them? Could I take it that wake-sleep is used to pre-train a DBN, while contrastive divergence is used to train an RBM?

yuanhuang