Style LoRA Training Guide for Stable Diffusion 1.5 and SDXL: Concepts, Results and Conclusion

#stablediffusion #A1111 #AI #Lora #koyass #sd #sdxl #style #styletraining

Also see: realistic character training for SD 1.5 and SDXL
You can download the SD 1.5 model (regularized) from

This video is about training a style for Stable Diffusion 1.5 and SDXL with LoRA: the concepts, how it differs from object or character training in terms of data and training parameters, how to tell whether the style has trained well, and what to expect from it.
00:00:00 introduction for style training includes results showcase
00:01:17 data preparation
00:02:16 folder preparation for Kohya training script
this includes class name, instance name
00:03:50 captioning images
00:06:08 Kohya ss LoRA training settings for SD 1.5 without regularization
00:10:43 initial review of results
00:11:38 Kohya ss LoRA training settings for SD 1.5 with regularization images
00:12:50 SDXL LoRA training parameters
00:17:07 testing results and comparisons in A1111
shows XYZ comparisons across different prompts and LoRA models, with and without regularization
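
The folder preparation step can be sketched in Python. This is a minimal sketch, assuming the commonly used Kohya layout in which each image folder is named `<repeats>_<instance name> <class name>` and each regularization folder `<repeats>_<class name>`. The names `sketchstyle` and `style`, the helper `make_kohya_folders`, and the parent folders `img`, `reg`, `model` and `log` are illustrative assumptions, not values taken from the video.

```python
from pathlib import Path


def make_kohya_folders(root, instance, class_name, repeats, reg_repeats=1):
    """Create a Kohya-style training layout under `root` (hypothetical helper).

    Assumed convention: the image folder is '<repeats>_<instance> <class>'
    and the regularization folder is '<reg_repeats>_<class>'.
    """
    root = Path(root)
    img_dir = root / "img" / f"{repeats}_{instance} {class_name}"
    reg_dir = root / "reg" / f"{reg_repeats}_{class_name}"
    # 'model' and 'log' hold the trained LoRA files and the training logs.
    for folder in (img_dir, reg_dir, root / "model", root / "log"):
        folder.mkdir(parents=True, exist_ok=True)
    return img_dir, reg_dir
```

For example, `make_kohya_folders("training", "sketchstyle", "style", repeats=2)` would create `training/img/2_sketchstyle style` for the training images and `training/reg/1_style` for the regularization images.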

The principles explained here apply to any style, whether it is art, clothing or anything else.

We will also compare training with and without regularization images and determine which option produces better results.

The style I am going to train is a simple black-and-white sketch style, similar to hand drawing, which is used in many illustrations.
Like many styles, it is already known to SDXL and can be produced by SD 1.5 with the right prompts, but a LoRA makes it easier and more straightforward.
Regardless of how useful this particular style is, the principles for creating your own style LoRA are the same.



Conclusion:
1- To learn a style, we need a large number of different images from different classes that have only the style in common.
2- 100 to 400 images is a good number for style training; the more the better.
3- A low number of repeats is very important: 1 to 4, depending on how many images you have. For 400 images, 1 or 2 is more than enough. 1600 steps worked for this simple style; complex styles could require a lot more, and this differs from one dataset to another.
4- Captions must describe everything except the style; style details must be removed from the captions.
5- Regularization improves the results of a style, just as it does with characters.
6- Simple styles don't need a network dimension above 32 for SD 1.5 or 16 for SDXL; complex styles could require a lot more.
7- SDXL already contains a great many styles, so it is unlikely that you need to train a new art style!
8- A noise value of 0.0357 might be useful for SDXL training (in the advanced settings).
9- --network_train_unet_only for SDXL didn't improve results in this example; it is better to test with and without it for each dataset.
10- Regularization is strongly recommended to increase model flexibility and quality; still, test with and without it and choose the better option.
11- A style LoRA is successful if it works well at weight 1 and can also run at higher weights (from 1 up to 2), if it learns only the style and not the training data itself, and if it mixes with other LoRAs without corrupting their output.
12- A style will affect the output to some degree, no matter how light it is.
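
Points 3 and 4 can be illustrated with a short Python sketch. It assumes that total steps are computed as images × repeats × epochs divided by batch size, and that captions are comma-separated tags. The epoch count, the batch size and the style tag list are assumptions for illustration, and both helper functions are hypothetical rather than part of Kohya.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Estimate training steps as images x repeats x epochs / batch size."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def strip_style_tags(caption, style_tags):
    """Remove style-describing tags from a comma-separated caption."""
    banned = {tag.strip().lower() for tag in style_tags}
    kept = []
    for tag in caption.split(","):
        tag = tag.strip()
        # Keep every non-empty tag that does not describe the style.
        if tag and tag.lower() not in banned:
            kept.append(tag)
    return ", ".join(kept)


# 400 images, 1 repeat, 4 epochs (assumed), batch size 1 (assumed).
print(total_steps(400, 1, 4))  # 1600

# Hypothetical style tags for a black-and-white sketch style.
style = ["black and white", "sketch", "pencil drawing"]
print(strip_style_tags("a cat on a chair, black and white, sketch", style))
# a cat on a chair
```

Under these assumptions, 400 images with 1 repeat reach 1600 steps in 4 epochs. The caption helper keeps the description of the subject and drops only the tags that describe the style, so that the style is absorbed by the LoRA instead of being tied to caption words.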

Other useful info about how Stable Diffusion works in general, plus some tips, can be seen at
Beginner's guide to Stable Diffusion in Automatic1111 at:
Cloth/object training

Computer Specs:
Laptop: Legion 5 Pro
Processor: AMD Ryzen 7 5800H, 3201 MHz
System RAM: 16.0 GB
Graphics GPU: NVIDIA GeForce RTX 3070 Laptop GPU 8GB
Comments

Thank you! A big help getting started with styles and enough info to play with the numbers and test things out.

sarahthecolorist

Hello! This was an extremely useful video! I'm currently a graduate student in ML and I'm working on training a model for style then using that in img2img. Do you think it's possible to use LoRA with img2img?

kitty

I'm aware of LoRA. Mmmmh...
I've yet to see a tutorial of someone training a full SDXL checkpoint.

lefourbe

Thanks for this video!

Must all the training images share the same resolution and aspect ratio? I didn't really catch what happened when you chose to use only square images. I think you said cropping is unnecessary.

I'm attempting to train on a set of med to high res images that have different aspect ratios. I think they are all at least 1024x1024, many are closer to 4k. I don't mind cropping a dozen or so images for training, but I don't understand how/why this is important.

workplaydie

Hello, I'm happy to see another video, I was really wanting to learn about creating LoRAs with styles... but I'm still racking my brain over characters... I'm trying to create a character, but between 1200 and 1500 steps it starts overfitting while still not having "learned" all the characteristics of the character. What can I do to learn more details without starting to overfit? Thank you

RyokoChanGamer

Hello friend, is there going to be anything new on LoRA training of people? I haven't heard anything in a long time, maybe there are some new software features

K-A_Z_A-K_S_URALA

So now we can train styles, object/subject... Now the question remains how we train a design language. Try making a refrigerator in the design of Volvo or Gucci, for example!?

kallamamran

Hey! I see you're using runpod, could you tell us how much time it took to train the LoRA? Just to know more or less the amount of money needed. Also, I want to train a new style which is used in a friend's videogame, and I have more than 1000 images (characters, hand with phone, stuff like that, but no backgrounds): how many epochs, repeats, and steps should I use? And actually, could you explain better what the differences are between these three? Nice video by the way, and thanks!

yhuna

I am confused about one thing. During captioning, do we need to caption all the keywords that we want to train into the model, or do we have to remove those tags if we want them trained into the model? Which one is correct?

Ziko

Hey could you please let me know where to get the regularization images?

ThShooter

Sorry, but I don't follow you. I think it is a good video, but I don't understand you very well…

ricardoc