How To Install And Use Kohya LoRA GUI / Web UI on RunPod IO With Stable Diffusion & Automatic1111

How to install the famous Kohya SS LoRA GUI on RunPod pods and train in the cloud as seamlessly as on your own PC, then use the Automatic1111 Web UI to generate images with your trained LoRA files. Everything is explained step by step, and a GitHub file with all the necessary commands is provided. If you want to use Kohya's Stable Diffusion trainers on RunPod, this tutorial is for you.
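
The linked GitHub file contains the exact, tested commands. As a rough sketch, assuming the bmaltais/kohya_ss repository and RunPod's /workspace volume, the installation boils down to something like this:

    # A sketch only -- see the linked GitHub file for the exact commands.
    apt-get update && apt-get install -y python3.10-tk   # Tkinter, required by the GUI
    cd /workspace
    git clone https://github.com/bmaltais/kohya_ss
    cd kohya_ss
    python3 -m venv venv            # a fresh venv so A1111's environment stays untouched
    source venv/bin/activate
    ./setup.sh                      # installs the Python dependencies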

Source GitHub File ⤵️

Auto Installer Script ⤵️

Sign up for RunPod ⤵️

Our Discord server ⤵️

If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on Patreon 🥰 ⤵️

Technology & Science: News, Tips, Tutorials, Tricks, Best Applications, Guides, Reviews ⤵️

Playlist of Stable Diffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img ⤵️

0:00 Introduction to how to install Kohya GUI on RunPod tutorial
0:20 Pick which RunPod server and template
1:20 Starting installation of Kohya LoRA on RunPod
3:42 How to start Kohya Web GUI after installation
4:16 How to download models on RunPod and start Kohya LoRA training
5:36 LoRA training parameters
6:57 Starting Kohya LoRA training on RunPod
7:46 Where are Kohya LoRA training checkpoints saved
8:05 How to use LoRA saved checkpoints on RunPod
8:29 How to use LoRA checkpoints in Automatic1111 Web UI
9:12 Noticing a very crucial mistake during training
10:59 Testing different checkpoints after fixing the previous training mistake
11:36 How to understand model overtraining
12:28 How to fix overtraining problem

Title: Install Kohya GUI on RunPod for LoRA Training: Step-by-Step Tutorial

Description:

Welcome to my comprehensive guide on how to install Kohya GUI on RunPod for LoRA training. I take you through each step, explained clearly so you can follow along with ease. This tutorial will help you set up a powerful development environment using an RTX 3090 GPU on a RunPod instance with 30 GB of RAM.

In this video, we will:

Deploy a community cloud with a specific template.
Edit template overrides and set the container disk.
Connect to JupyterLab and clone a GitHub repository.
Generate a new virtual environment and activate it.
Install Kohya on RunPod and handle common errors.
Set up and start the Kohya web UI on RunPod (see the launch sketch after this list).
Execute a quick demonstration of training a realistic vision model.
Troubleshoot common errors during the training process.
Optimize the training process and improve training quality.
Navigate through our GitHub repository for further learning.
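
For reference, launching the web UI on a pod typically looks something like the sketch below; the flags are assumptions based on the kohya_ss launcher, not commands verbatim from the video, so check the GitHub file for the exact line.

    # A sketch of starting the Kohya GUI on RunPod. --share creates a public
    # gradio.live URL; alternatively, expose the HTTP port in the RunPod
    # template and start with --listen 0.0.0.0 --server_port 7860 instead.
    cd /workspace/kohya_ss
    source venv/bin/activate
    ./gui.sh --share --headless

As a commenter notes below, exposing the port through the template can be more reliable than gradio.live tunnels.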
Remember, if you're unfamiliar with how to use Kohya or RunPod, I've included links to excellent tutorials in the video description.

Whether you're just getting started with Kohya, RunPod, or LoRA training, or looking to enhance your existing skills, this tutorial offers valuable insights.

Don't forget to like, share, and subscribe for more tutorials like this!

#StableDiffusion #Kohya #RunPod #LoRATraining #Tutorial #MachineLearning #AI
Comments

Please join our Discord, mention me, and ask me any questions. I am open to private consulting with a Patreon subscription.

SECourses

I've tried installing and reinstalling Kohya like 3 times without success. I found this video and for some reason it worked this time! I don't understand why or what I was doing wrong, but thank you for making this easy to understand and follow along.

gregorycoleman

I love that you show the mistakes. It makes me confident in doing it myself, knowing that mistakes happen and there are ways to fix them. Thank you!

hashir

English pronunciation tip: the e is silent in the past-tense suffix "ed", so "copied" is not copy-ed but copeed. The same goes for all "ed" past tenses.

horsedevoursy

Wish I had seen this before 10 hours of troubleshooting over the last 2 days. Thank you!

willpulier

Thank you for the video and for not cutting out the little mistake with the folder; it helps in understanding similar mistakes. Anyway, I'm wondering why you didn't caption. If you had captioned the variables in the training data set, you would have been able to get the training to ignore the backgrounds and such. At 10:14, for example, the first image caption could be: TRIGGER_WORD, photograph of a man, close-up portrait, shot from below, shot with wide angle lens, looking at viewer, wearing blue shirt, soft lighting, outdoors, stairs, garden, lighting pole, building, blue sky, clouds, summer. So basically I follow the caption structure: <Globals> <Type/Perspective/"Of a..."> <Action Words> <Subject Descriptions> <Notable Details> <Background/Location> <Loose Associations>. NB: I specifically didn't caption your glasses or beard, for example, as we want the training to learn you consistently with those. But if you had some training images, perhaps clean-shaven or without glasses, then we would caption these. What do you think?

CBikeLondon
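
For anyone wanting to try the caption structure described above: Kohya's trainers read per-image captions from text files that share the image's filename (the extension is configurable, typically .txt). A sketch, with hypothetical paths and folder name:

    # Each image gets a sibling .txt caption file with the same basename.
    # "20_TRIGGER_WORD man" follows the usual <repeats>_<token> <class> folder pattern.
    cd "/workspace/lora/img/20_TRIGGER_WORD man"
    echo "TRIGGER_WORD, photograph of a man, close-up portrait, shot from below, looking at viewer, wearing blue shirt, outdoors, stairs, garden, blue sky" > img001.txt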

I had been searching for how to do this exact thing for 2 days straight, and your video comes at the perfect moment 🙏

bgyzxik

I keep getting the error "Image folder path is missing", then when I go to set up the folder it says FileNotFoundError: [Errno 2] No such file or directory: 'D:\\loraname\\image'

alexsomething

Very detailed guide as usual, thank you.
Now excuse the long comment, but your guides have been pretty helpful in the past, so I thought it was worth spending some time commenting ;)


Some minor observations:
- You don't *need* to install python3.10-tk with the venv activated, since apt-get installs it system-wide. It doesn't hurt to have the venv activated of course, it just makes no difference.
- With the current RunPod templates it doesn't really matter if you affect A1111's webui's venv; you can even take advantage of it (by not creating a new one) to reduce disk usage and possibly the time it takes to install kohya_ss, since it already comes with the CUDA version of Torch (and even xformers). You can reset the venv by deleting it and rebooting (or copying /venv to /workspace/venv). The downside, of course, is you may have to reboot to run A1111's webui.
- I think it's preferable not to use the --share option on RunPod and instead add the HTTP port to the template's list of ports (which will then create a public URL automatically and include a link in RunPod's UI, which I find useful sometimes). It's not as important with kohya_ss since the webui is only used for a few minutes at a time, but with A1111's webui the gradio.live tunnels are sometimes unreliable, so I avoid them.

Also, while not the main subject of the video, a couple of comments on the training itself:
- That 256 rank sounds extremely high to me. People (me included) are nowadays getting excellent results with ranks as low as 16 or even 8. I've also started downsizing LoRAs I download, and most of them work about the same with a rank of 16 or so. The only obvious differences I see compared to my trainings are that you don't use captions and that I use as high a batch size as I can; might one of those be "punishing" your trainings?
- I'd suggest setting up the sample images options, with a prompt and seed (option --d), so that you can check while it's ongoing how/if the training works, and whether at some point it starts overcooking the outputs, by keeping an eye on the sample images in the model/sample folder. It has saved me from wasting time on long trainings a few times.

ToniCorvera
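
To make the two suggestions above concrete, the underlying sd-scripts flags look roughly like this; a sketch with illustrative values (the GUI assembles this command for you, and the usual model/data/output arguments are abbreviated):

    # Low rank instead of 256, a bigger batch, and periodic sample images
    # (plus the usual --pretrained_model_name_or_path, --train_data_dir,
    # --output_dir, --resolution, etc.).
    accelerate launch train_network.py \
      --network_module=networks.lora \
      --network_dim=16 --network_alpha=8 \
      --train_batch_size=4 \
      --sample_prompts=/workspace/sample_prompts.txt \
      --sample_every_n_steps=100

    # sample_prompts.txt holds one prompt per line; inline options set width/height
    # (--w/--h), seed (--d), CFG scale (--l), and steps (--s), for example:
    #   TRIGGER_WORD, photograph of a man, close-up portrait --w 512 --h 512 --d 42 --l 7 --s 28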

Not a good experience with RunPod: I keep running into RunPod saying the directory is not found when it exists, so I'm unable to train.

hceely

Hi, my LoRA training saves models in .json format and they won't load in SD, help! At 7:48 you have a checkpoint name and number in .safetensors format; mine saves the LoRA as .json instead, and those models are impossible to use.

davedm

It's not working with the latest version of Stable Diffusion 9.1.0 on RunPod.

Rátgeber_India

How do you get a checkpoint to work?
I placed the checkpoint in the correct path and renamed it to a .ckpt file.
I refreshed the checkpoints. It loaded and showed the checkpoint successfully added to the list of checkpoints.
But when I try to choose the checkpoint, it does not load; it returns to the default v1.5 and v2 checkpoints.

tradersjournal
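
A possible cause worth checking for the comment above: A1111 picks the loader by file extension, so a .safetensors file renamed to .ckpt will fail to load. A sketch with a hypothetical URL and filename, using the webui's standard checkpoint folder:

    # Keep the original extension; don't rename .safetensors to .ckpt.
    cd /workspace/stable-diffusion-webui/models/Stable-diffusion
    wget "https://example.com/path/to/model.safetensors"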

" ERROR No data found. Please verify arguments (train_data_dir must be the parent of folders with images) / ". Please help me with this.

PrasanthK-lh
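
That error almost always means the folder layout is off: train_data_dir must point at the parent directory, and each image subfolder must follow the <repeats>_<name> naming convention. A sketch with hypothetical names:

    # train_data_dir = /workspace/lora/img (the PARENT), not the image folder itself;
    # "20" is the number of repeats per epoch, "ohwx man" the instance token and class.
    mkdir -p "/workspace/lora/img/20_ohwx man"
    # Put the training images inside "20_ohwx man", then point the GUI at /workspace/lora/img.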

Hi, is there a way to use 8x GPUs at the same time with a single A1111 instance? And how much faster will it render? Cheerz!

aerografiaaerografia

What exactly is the difference between a concept and a class?
I'm curious what this actually does to the model.
I do captioning for training; how are concepts and classes different from that kind of captioning?
The only thing I think I've understood correctly about machine learning from your explanation is that you store your training in tokens and retrieve them later.

I've heard a lot about putting something like "dog" or "person" into a concept or class, but I don't know how that works for learning.

SUNG-bq

Hey there @SECourses, I would like to know if it's possible to train something broader, let's say an art style?

Example: feed the model a bunch of artworks by Michelangelo from a specific period of his life to create a model dedicated to replicating his style from that period.

CODX

Your solution for the Tkinter error via the venv works to serve the GUI. But for some reason, when trying to train or "print train commands" from the GUI, the Tkinter error appears again, even if I close out and redownload the scripts in that venv. Any ideas here?

willpulier

I want to try this for clothes; should I only change the class to "outfit" or "clothes"?

ax

How do you use TensorBoard? It wants to load on localhost. Is there a way to launch TensorBoard with a share link? I would like to monitor the progress live rather than runpodctl the logs out to view them locally... =]

moonusaco
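
One way to approach the TensorBoard question above (a sketch; the log path and port are assumptions): bind it to all interfaces and expose the port through the RunPod template, which gives a proxied public URL instead of localhost only.

    # Serve TensorBoard on all interfaces so RunPod's port proxy can reach it;
    # add 6006 as an exposed HTTP port in the pod/template settings.
    tensorboard --logdir /workspace/kohya_ss/logs --host 0.0.0.0 --port 6006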