Stable/Latent Diffusion - High-Resolution Image Synthesis with Latent Diffusion Models Explained

Comments

I love how they combined VAE, GANs (adversarial loss) and diffusion models

gonzalorubio

This is the only detailed, non-hype based walkthrough of how SD works, thanks. Especially for explaining the math.

thecheekychinaman

These videos are criminally underrated! This one, ViT, Attention, and LoRA have helped me so much with my learning! As a CompSci student majoring in AI, going from lectures and books to reading, understanding, and implementing the actual papers is a big leap, and you've made that leap a lot simpler and more digestible. Thank you so much, please never stop this series!

hiepphamduc

This was a really good video. It really helped me understand this diffusion concept that I didn't know about. Your videos are underrated, but I have no doubt they will gain traction over time.

acasualviewer

Congratulations on the video. I've always had doubts about whether Stable Diffusion is the same thing as Latent Diffusion. Now with your explanation I understand that they are the same thing.

claudeclaude

Thank you so much Gabriel! I wanted to understand the intuition behind Latent Diffusion, and watching your video saved me tons of time from actually reading through the paper.

lzh

Extremely underrated video. Thanks so much for all the explanations!

aesadugur

Thank you for the video. It's the first one I found about training the AE in LDMs, and I think that part is the hardest to understand in the whole model; your explanation makes it very easy to follow. One thing I would add is that the AE in the paper is based on VQ-VAE, so L_rec uses a perceptual loss and L_adv is a patch-based adversarial objective. Anyway, I hope you will continue this series!
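To make the comment above concrete, here is a minimal sketch of how those pieces could combine into the first-stage autoencoder objective. The weights, the non-saturating generator term, and the function names are illustrative assumptions, not the paper's exact code; the real model averages patch logits from a learned discriminator network.

```python
import math

def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def generator_adv_loss(patch_logits):
    """Non-saturating adversarial term over per-patch discriminator
    logits: the autoencoder wants every patch scored as 'real'."""
    return sum(softplus(-s) for s in patch_logits) / len(patch_logits)

def ae_total_loss(l_perceptual, patch_logits, l_reg, w_adv=0.5, w_reg=1e-6):
    """Total first-stage loss: perceptual reconstruction (L_rec)
    + weighted patch adversarial term (L_adv)
    + a small KL (or VQ) regularizer on the latent."""
    return l_perceptual + w_adv * generator_adv_loss(patch_logits) + w_reg * l_reg
```

When the discriminator confidently scores all patches as real (large positive logits), the adversarial term vanishes and the loss reduces to the perceptual reconstruction plus the tiny latent regularizer.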

Anonymou

Really enjoyed every minute. You've got a new subscriber.

AI_For_Scientists

Hello, thanks for the explanations! Just a quick note on the Greek letters: it's "psi", not "phi", here, and the "rho"_theta you mention is actually a "tau"_theta.

ahamuffin

Thanks for the explanation, man! Amazing video!

denistimonin

Fantastic videos! Any plan for the recently published ControlNet?

alexxiang

Good video, but it didn't explain how the cross-attention output is actually used in the UNet.
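On the question above, a common reading of the paper is that inside each UNet block the spatial feature map is flattened into "pixel" tokens that act as queries, while the text embeddings supply keys and values; the attention output is then added back to the features through a residual connection, so the text conditions every resolution level. The sketch below is illustrative (single head, no learned projections), not the repository's code:

```python
import math

def cross_attention(pixel_tokens, text_tokens):
    """pixel_tokens: list of d-dim feature vectors (the queries).
    text_tokens: text embeddings, used here as both keys and values.
    Returns features with the attended text added residually."""
    d = len(pixel_tokens[0])
    out = []
    for q in pixel_tokens:
        # scaled dot-product scores against every text token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in text_tokens]
        # softmax over the text tokens
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        # weighted sum of the values (= text tokens here)
        attended = [sum(wi * v[j] for wi, v in zip(w, text_tokens))
                    for j in range(d)]
        # residual connection: add the attention output to the feature
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out
```

In the real model the queries, keys, and values each pass through learned linear projections first, and multiple heads run in parallel; this stripped-down version only shows where the text enters the computation.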

NadavBenedek

Does the text embedding serve as a label in this model? For example, I put in a picture of a penguin and describe it as "penguin", and the model learns to match the picture and the text and reduce the loss.

hunterli

Hey, what's the setup you are using to write and see the paper on split-screen?

prateekpani

Thank you very much for an amazing explanation!
One question, though: at minute 22:00, when explaining the autoencoder loss function, you apply the log to the output of the discriminator. Isn't that a bit problematic, since log is not defined when the discriminator predicts "fake" (an output of 0)?
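On the log(0) concern above: a standard resolution (an assumption here, not something the video states) is that implementations keep the discriminator output as a raw score s (a logit) rather than a probability, so log D(x) = log sigmoid(s) = -softplus(-s), which is finite for every real s:

```python
import math

def log_sigmoid(s):
    """Numerically stable log(sigmoid(s)) = -log(1 + exp(-s)).
    Never evaluates log at 0, even for very negative scores."""
    if s >= 0:
        return -math.log1p(math.exp(-s))
    # for s < 0, rewrite to avoid overflow in exp(-s)
    return s - math.log1p(math.exp(s))
```

Even when the discriminator is extremely confident an input is fake (s very negative), the result is a large negative number rather than -inf, so the gradient stays well defined.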

gabrielsamberg

How do you know the "epsilon"? I mean, for the second training step, where you compute MSE(noise, predicted noise), how do you know the "noise" beforehand? Does it come from a function "P"? Also, what does it mean to train the diffusion layers? Are diffusion layers also like convolutions?
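On the question above: epsilon is not known beforehand, it is sampled fresh from a standard normal at every training step, used to corrupt the clean latent, and the UNet is trained to recover that very sample. A pure-Python sketch of one DDPM-style training example (the linear beta schedule and function names are illustrative assumptions):

```python
import math
import random

def make_alpha_bars(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bar, out = 1.0, []
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        alpha_bar *= 1.0 - beta
        out.append(alpha_bar)
    return out

def diffusion_training_example(z0, alpha_bars):
    """One training example: (noisy latent, timestep, target epsilon)."""
    t = random.randrange(len(alpha_bars))
    # the ground-truth noise is simply drawn here, so it is known exactly
    eps = [random.gauss(0.0, 1.0) for _ in z0]
    ab = alpha_bars[t]
    z_t = [math.sqrt(ab) * z + math.sqrt(1.0 - ab) * e
           for z, e in zip(z0, eps)]
    return z_t, t, eps  # the loss would be MSE(unet(z_t, t), eps)
```

The "diffusion layers" being trained are just the UNet's ordinary layers (convolutions, attention, etc.); "diffusion" names the training objective, not a special layer type.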

tushargarg

Why is it better to predict the noise instead of the denoised image directly in the UNet? Thanks for your videos.

anoubhav

Really nice video. I have a doubt about the latent loss: 1) Is the discriminator trying to classify the encoder-decoder output as fake and the real input as real? 2) If so, shouldn't the second term in the loss function (the discriminator output on the encoder-decoder reconstruction) have a plus sign?

oxxdkvy

I love your videos, I've been following you since the girlfriend video. Can you please explain RWKV models?

namidasora