StackGAN Implementation | Text-to-Image Generation with Stacked Generative Adversarial Networks

Tensorflow implementation of StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

“Generative Adversarial Networks (GAN) is the most interesting idea in the last ten years in machine learning.” — Yann LeCun, Director, Facebook AI

Implementation:
StackGAN: Text to Photo-Realistic Image Synthesis
Model Architecture of StackGAN
Preparation of Dataset
Implementation of Stage I of StackGAN
Implementation of Stage II of StackGAN

By the end of this video, you will have a working implementation of the StackGAN research paper that generates photo-realistic images from text. You will also know how to train StackGAN on your own dataset or problem statement.
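The two stages fit together as a pipeline: Stage I maps a conditioning vector and noise to a low-resolution 64×64 image, and Stage II refines that image, conditioned on the same text, into a 256×256 one. A shape-level sketch of this flow (pure NumPy stand-ins for the real networks; the vector sizes are illustrative assumptions, not the paper's exact values):

```python
import numpy as np

rng = np.random.default_rng(0)

def stage1_generator(c, z):
    # Stand-in: the real Stage I generator upsamples the concatenation
    # of the conditioning vector c and noise z into a 64x64x3 image.
    batch = c.shape[0]
    return rng.random((batch, 64, 64, 3))

def stage2_generator(img64, c):
    # Stand-in: the real Stage II generator downsamples the Stage I image,
    # fuses in c, passes the result through residual blocks, and then
    # upsamples to a 256x256x3 image.
    batch = img64.shape[0]
    return rng.random((batch, 256, 256, 3))

c = rng.standard_normal((4, 128))   # conditioning vectors for 4 captions
z = rng.standard_normal((4, 100))   # noise vectors

img64 = stage1_generator(c, z)
img256 = stage2_generator(img64, c)
print(img64.shape, img256.shape)    # (4, 64, 64, 3) (4, 256, 256, 3)
```

The key design point is that both stages see the same text conditioning, so Stage II can correct details Stage I got wrong rather than inventing a new image.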

Example:
(Text Input): The bird is black with green and has a very short beak
(Output — Generated photo-realistic images)

The model architecture of StackGAN consists mainly of the following components:
Embedding: converts the variable-length input text into a fixed-length vector. We will use a pre-trained character-level embedding.
Conditioning Augmentation (CA)
Stage I Generator: generates low-resolution (64×64) images.
Stage I Discriminator
Residual Blocks
Stage II Generator: generates high-resolution (256×256) images.
Stage II Discriminator
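Of the components above, Conditioning Augmentation is the least self-explanatory: instead of feeding the text embedding to the generator directly, it samples the conditioning vector from a Gaussian whose mean and (diagonal) standard deviation are computed from the embedding, which smooths the conditioning manifold. A minimal NumPy sketch of that sampling step, using the reparameterization trick (the layer sizes and random weights here are illustrative assumptions, not the trained model's values):

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 1024   # size of the pre-trained text embedding (illustrative)
COND_DIM = 128   # size of the conditioning vector c (illustrative)

# One linear layer maps the embedding to [mu, log_sigma].
W = rng.normal(0.0, 0.02, size=(EMB_DIM, 2 * COND_DIM))
b = np.zeros(2 * COND_DIM)

def conditioning_augmentation(text_embedding):
    """Sample c ~ N(mu(e), diag(sigma(e)^2)) via the reparameterization trick."""
    h = text_embedding @ W + b
    mu, log_sigma = h[..., :COND_DIM], h[..., COND_DIM:]
    eps = rng.standard_normal(mu.shape)
    # mu + sigma * eps keeps the sample differentiable w.r.t. mu and sigma,
    # so the sampling step can sit inside the generator during training.
    return mu + np.exp(log_sigma) * eps

# Usage: a batch of 4 (fake) caption embeddings.
e = rng.standard_normal((4, EMB_DIM))
c = conditioning_augmentation(e)
print(c.shape)  # (4, 128)
```

In the full model, a KL-divergence term between N(mu, sigma) and the standard normal is added to the generator loss to regularize this distribution.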

#stackgan #gans #generativeadversarialnetworks #neuralnetworks #ai #deeplearning #computervision #ml #machinelearning #pifordtechnologies #generativeai #generativemodels
Comments

All I can say is that you have done a lot of work on deep learning, and I would like to thank you.

AllahKbiir

As the title mentions, text-to-image: where is the function that takes text input and returns an image?

lol-kipd

After stage 1, our output images do not appear in the test folder.

dindhukannepalli

Ma'am, where are you providing the text input in this code? Can you please reply ASAP.

shubhamksah

There is no text-to-image interface. Misleading title.

adityakantipudi

Superb, ma'am. Please keep making videos on different GANs. Thank you for your good content; I keep following all your videos.

pramodhbr

Hi Ma'am, I'm trying to run this, and the maximum number of epochs I can train for is 35; after that the Google Colab limit is exceeded, and on my local system even one epoch takes a very long time. Is there any way to solve this problem? If not, where can I find a model trained for 500 epochs? Please reply, Ma'am.

MeetJani

It's great; I learned a lot from your explanation. Can you tell me more about class_id? What is its role, and where is it used in the models? I tried to find out but couldn't understand it.

abuzarrezaee

Madam, if I want to transfer styles for videos based on text conditions, can I use this model with CLIP?

World-umvo

After training, what changes do we have to make to our model, where do we provide the input text in the code, and where will the output be shown? For example, "a car with red colour" is the input text and the output will be an image of a red car.

nikhilsuryvanshi

Following along... greetings from Indonesia.

MrEri

I am getting this error: "Kernel Restarting - The kernel appears to have died. It will restart automatically." Does anyone know how to fix it? I am running on Jupyter with Python 3.10 and TensorFlow 2.10, and I created a new environment and installed all the packages required for this project.

sangramchincholkar

Your video is amazing, ma'am. Could you also create a video about Stable Diffusion?

RadRebel

Thanks a million dear ma'am, you saved my life.

nafassaadat

Where in the code are we giving the text input?

ssp.

Thank you, ma'am, it's very helpful for my project.

sadiqpurki

I tried training with Colab Pro, but after 85 epochs it uses more than 26 GB of RAM and crashes.

siddharthm

I am getting an error while training stage 1:
"Data cardinality is ambiguous:
x sizes: 64, 1
y sizes: 64
Make sure all arrays contain the same number of samples."
The error is raised at the discriminator loss. Any help would be highly appreciated.

saugatneupane

Hi Aarohi, can you please tell me how we can train on our own dataset to get a char-CNN-RNN text embedding? I have searched everywhere but am unable to find a solution. Also, is it possible to use a sentence-level embedding technique? If so, can you please suggest an approach? Eagerly awaiting your reply. Thanks.

aarthifnawaz

"This bird folder is a folder" - Aarohi

ddtechnologies