How does CLIP text-to-image generation work?

Comments

Great talk - thanks. So the image is generated, and CLIP assesses how close it is to the prompt. But which algorithm actually performs the step where the section in the middle of the noise image is changed into the dolphin's nose? Is there a third process involved, in addition to the image generator and CLIP, or does the image generator keep altering the noise image until CLIP says "finished"?
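For anyone else wondering about this: there is no third process. The generator's latent (the "noise image") is nudged step by step by gradient ascent on CLIP's similarity score, and the loop simply stops after a step budget or when the score plateaus. Here is a toy numpy sketch of that feedback loop; the quadratic `clip_similarity` is a hypothetical stand-in for a real CLIP model, and `target`, `lr`, and the step count are made-up illustration values:

```python
import numpy as np

# Hypothetical stand-in for the prompt's CLIP text embedding (assumption:
# a real pipeline would embed the text prompt with CLIP's text encoder).
target = np.array([0.8, -0.3, 0.5])

def clip_similarity(image):
    # Toy score: higher when the "image" is closer to the target embedding.
    # In a real pipeline this would be cosine similarity between CLIP's
    # image embedding and the text embedding.
    return -np.sum((image - target) ** 2)

# Start from a blank/noise latent and repeatedly nudge it in the direction
# that raises the CLIP score -- the same two components (generator state +
# CLIP score) are reused every iteration; nothing else is involved.
image = np.zeros(3)
lr = 0.1
for step in range(200):
    grad = -2 * (image - target)  # gradient of the similarity score
    image += lr * grad            # ascend: move the image toward the prompt

# After enough steps the score stops improving and the loop ends.
print(abs(clip_similarity(image)) < 1e-6)
```

In real systems (e.g. VQGAN+CLIP) the gradient flows through the generator network via backpropagation rather than this closed-form update, but the loop structure is the same.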

dennishmiller

cheers man.. i've been using diffusion for a while.. but i'm interested in understanding it more deeply.. this has helped :)

serloinz

Hello. There's something I don't understand.
When do you evaluate the image with CLIP: at every iteration, or within the convolution?

lacapi_tv

Can you recommend some of the Discord channels you mention towards the end of the video?

FLANCKE

nice overview, thanks! is there a website for your ITP course?

socalledsound

now that so many GAN images are being posted, I wonder if future GANs will generate images that look like old GANs, because they'll be scraping the stuff old models have generated.

jameshughes

Great video and channel. Would you consider covering Nvidia's image generators like GauGAN, which have made incredible progress as well?

RokasJovaisa

Great video, but probably should be about 5 minutes lol, a lot of skipping to get to the meaty parts

beecee