InstructPix2Pix Explained - Edit Images with Words!

Join me as I explain the core idea behind 'InstructPix2Pix', which lets you edit images using natural language instructions!
The thumbnail was generated with "Add sunglasses", "make it in a city at night" and "give him a leather jacket" - far better than messing about with Photoshop :)
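The edits above come from InstructPix2Pix's two-way classifier-free guidance, which blends three noise predictions (unconditional, image-conditioned, and image+text-conditioned) using separate image and text guidance scales. A minimal NumPy sketch of that combination — the function name and toy scale values here are illustrative, not taken from the paper's code:

```python
import numpy as np

def dual_cfg(eps_uncond, eps_img, eps_full, s_image=1.5, s_text=7.5):
    """Combine three noise predictions in the style of InstructPix2Pix's
    dual classifier-free guidance:
      eps_uncond: prediction with neither image nor text conditioning
      eps_img:    prediction with image conditioning only
      eps_full:   prediction with both image and text conditioning
    s_image pulls the result toward the input image; s_text pulls it
    toward the edit instruction."""
    return (eps_uncond
            + s_image * (eps_img - eps_uncond)
            + s_text * (eps_full - eps_img))

# Sanity check: with both scales at 1, the guidance collapses to the
# fully conditioned prediction.
rng = np.random.default_rng(0)
e0, ei, ef = (rng.standard_normal((4, 64, 64)) for _ in range(3))
assert np.allclose(dual_cfg(e0, ei, ef, 1.0, 1.0), ef)
```

Raising `s_image` keeps the output closer to the original photo, while raising `s_text` makes the edit instruction dominate — which is why the two scales are exposed as separate knobs.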
Comments

Something I forgot to mention: they generated 100x more training data and filtered it to pick the 'best' results for training! One way to try to improve the data quality, I guess :)
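That oversample-and-filter step can be sketched generically: generate many candidate edits, score each one, and keep only the top fraction. The paper uses CLIP-based filtering for the scoring; the score function and names below are just stand-ins:

```python
import random

def best_k(candidates, score, k):
    """Keep the k highest-scoring candidates -- the oversample-and-filter
    idea: generate far more data than you need, then keep only the best.
    `score` is whatever quality metric you trust (the paper filters with
    CLIP-based metrics; here it is a placeholder)."""
    return sorted(candidates, key=score, reverse=True)[:k]

random.seed(0)
# Pretend each float is the quality score of one generated edit pair.
samples = [random.random() for _ in range(100)]  # 100x oversampled
kept = best_k(samples, score=lambda s: s, k=10)  # keep the top 1/10
assert len(kept) == 10
```

The trade-off is compute: you pay for every discarded sample, but the filtered dataset is only as good as the scoring metric.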

Yenrabbit

Was hoping you'd cover this, great video!! Thanks

lkewis

Thanks for the great video! Question: at inference time, is z_t a randomly sampled vector, or is it a diffused version of the input image?
Because if it's the latter, they're passing the original image information in two ways (the initial latent and the image conditioning).
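On the z_t question: as I read the paper and the released pipeline, the initial latent z_T is pure Gaussian noise, and the input image enters only through the conditioning c_I, which is concatenated with z_t along the channel axis (the U-Net's first conv layer is widened to accept the extra channels). A shape-only sketch, with all names and dimensions illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy latent at t=T: sampled from a standard Gaussian, carrying no
# information about the input image (per my reading of the paper).
z_t = rng.standard_normal((1, 4, 64, 64))

# VAE-encoded input image, same spatial size as the latent.
c_image = rng.standard_normal((1, 4, 64, 64))

# The image conditioning is concatenated channel-wise, so the U-Net's
# first conv sees 8 input channels instead of the usual 4.
unet_input = np.concatenate([z_t, c_image], axis=1)
assert unet_input.shape == (1, 8, 64, 64)
```

So the original image information should flow in through only one path (the concatenated conditioning), not two.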

aliteshnizi

Thanks for the cool review. I had a question while reading the paper:
I think this model can do not only overall style transfer but also localized object changes. But there's no direct hint that the model can infer where on the image to make a change, e.g. via masking or swapping word attention maps. I guess the localization ability comes from the generated dataset (instructions from GPT and images from Prompt2Prompt), even though balancing the guidance levels might also have an effect. What's your opinion on this?

imaine-qnvv

Could you please explain the loss function in a bit more detail?
Thanks
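For reference, the training objective is the standard latent-diffusion loss: the mean squared error between the noise actually added to the latent and the network's prediction of it, given the noisy latent, the timestep, and the image and text conditionings — L = E ||ε − ε_θ(z_t, t, c_I, c_T)||². A toy sketch (the function name is illustrative):

```python
import numpy as np

def diffusion_loss(eps_true, eps_pred):
    """Noise-prediction MSE, the standard latent-diffusion objective:
    eps_true is the Gaussian noise added to the latent, eps_pred is the
    network's estimate of it. Perfect prediction gives zero loss."""
    return float(np.mean((eps_true - eps_pred) ** 2))

eps = np.zeros((4, 64, 64))
assert diffusion_loss(eps, eps) == 0.0  # perfect prediction -> zero loss
```

The only InstructPix2Pix-specific part is what the network is conditioned on (both the encoded input image and the instruction text); the loss itself is unchanged from standard latent diffusion training.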

adityakharbanda

This isn't working on the updated Stable Diffusion WebUI or Forge.

parthwagh

Love you bro, you'll make me a billionaire, great work!!!

kornellewychan