FAST Flux GGUF for low VRAM GPUs with Highest Quality. Installation, Tips & Performance Comparison.

We install the new GGUF node on ComfyUI locally for NVIDIA or AMD GPUs.
The image generation examples demonstrate both the output quality and the detailed performance, followed by tips & tricks for Flux.1 DEV and Flux.1 SCHNELL.
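
For reference, a minimal install sketch, assuming the widely used city96/ComfyUI-GGUF custom node and a git-based ComfyUI install (portable and ZLUDA setups, as covered in the video, may differ):

    # clone the GGUF custom node into ComfyUI's custom_nodes folder
    cd ComfyUI/custom_nodes
    git clone https://github.com/city96/ComfyUI-GGUF
    # install the gguf Python package into the same environment ComfyUI runs in
    pip install --upgrade gguf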

Videos:

Links:

The GGUF Models:
About Quantization:

PLEASE CHECK THE PINNED COMMENT FOR UPDATES!

Chapters:
0:00 About Flux and GGUF
1:02 GGUF Installation
2:54 ZLUDA Update
3:46 Adding GGUF Loader
4:33 GGUF Models and Test
6:43 Example Generation
7:45 Result Comparison
9:47 Performance Details
13:16 Key findings

#comfyui #flux #gguf #stablediffusion
Comments

How is your performance with the low-VRAM GGUF quantized models?
UPDATE 2: Speed increased 2x with flux1-DEV/SCHNELL-Q5_K_S.gguf compared to the original models (tested on an AMD GPU). Important: you have to start with the runtime parameter '--force-fp32', and although this parameter speeds up the quantized models, it slows down the original ones! The T5 text encoder model currently has zero influence on my machine's performance, so I choose T5xxl_fp16.
Many different sizes are available; choose Q5_K_M or larger, and place it in your 'clip' folder.
Update the GGUF node (do a 'git pull' in the node's directory); most probably you will have to update ComfyUI as described in my video, too.
Replace the CLIP loader in your workflow with the new 'DualCLIPLoader (GGUF)', found under 'Add Node -> bootleg'. A command sketch of these steps follows below.
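
A minimal sketch of the update and launch steps above, assuming the node was cloned as ComfyUI/custom_nodes/ComfyUI-GGUF and that you start ComfyUI via main.py (adjust paths and launcher for portable or ZLUDA installs):

    # update the GGUF custom node in place
    cd ComfyUI/custom_nodes/ComfyUI-GGUF
    git pull
    # update ComfyUI itself, as described in the video
    cd ../..
    git pull
    # relaunch with fp32 forced; per the update above, this speeds up the
    # quantized models but slows down the original ones
    python main.py --force-fp32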

NextTechandAI

I get "RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x3072 and 1320x18432)
mat1 and mat2 shapes cannot be multiplied (1x3072 and 1320x18432)" when trying to use these new GGUF models with Forge UI. Does it even work with Forge?

TheGalacticIndian

This uses slightly less VRAM (Q8 vs FP8, F16 vs FP16), but it's not faster; as long as you don't exceed your VRAM pool, the speed stays the same. Same thing for Schnell: Q4 renders in 4 seconds on a 4090, just like FP16. The unified or "baked" BNB NF4 Flux models were much faster to load, but they are not compatible with LoRAs and are now considered deprecated.

crypt_exe

I have an NVIDIA 3060 with 6GB VRAM and am able to run the Flux dev model easily; it takes roughly 1 min 15 sec to generate with the 6.2GB GGUF file.

vivekkarumudi

How much do you "lose" using these models? My GPU can handle the full dev model, but it's slow. Would using these models be faster at the same quality, or do you lose noticeable quality? Also, what do the different Q* files mean?

JohnVanderbeck

Great video, bravo. I wanted to ask if you could do an update using the new t5_v1.1-xxl GGUF.
Thank you

Giorgio_Venturini

For a 12 GB 3080 Ti, what models do you recommend?

LX

I'm very new to this world and I'm learning quite a bit from your videos. One thing I don't quite understand, maybe you can help:
My system: Radeon RX 7700 XT (12GB VRAM) GPU / Ryzen 7800X3D / 64GB DDR5.
I'm running ComfyUI on Ubuntu; I followed one of your great tutorials.

I just can't run any fp16 versions of the models because it says it's running out of memory.

So, does the 16 in fp16 mean 16GB of VRAM is required?
Can I somehow leverage my PC's large RAM for something?
Is the CPU useful for anything in these scenarios?

BrunoOrsolon

3060 Ti. Took 10 minutes to finish 8 images on Flux. 😂😂😂😢😢

ericcheah

NF4 > GGUF. GGUF is slower due to being compressed, while NF4 was optimized for speed. As a trainer, I wish I could use either of them to train with; it is hideously slow being forced to batch size 1 on a 4090.

generalawareness

Hi! I'm lost at the ZLUDA step. I'm not using ComfyUI portable; I have cloned comfyui-zluda from patientx. In which folder do I need to 'pip install gguf'? Thanks for your videos!

luxiland

But I only get a black image as a result. Did I do anything wrong? Please help me, thanks in advance.

tamizhanVanmam

To uninstall or remove NF4, do I just delete the ComfyUI_bitsandbytes_NF4 folder?

DarwinsGreatestHits

I'm on AMD, using ZLUDA, and ComfyUI is up to date; I can see Flux support in the patch notes inside ComfyUI Manager, but I cannot get the DualCLIP loader into flux mode. Is there an extra step required that I could have missed?

Gwaboo

Which model do you recommend for an RTX 4070 Ti with 12GB VRAM?

lowaura

Hi, I'm new to all this. I don't see 'bootleg' in the Add Node dropdown? Little help... Ah, I got it: the command didn't install it on the first try for some reason :P
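
A quick check for that failure mode, assuming you install into the same Python environment ComfyUI runs in:

    # prints package details if the gguf dependency installed correctly
    pip show gguf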

SamiHD

Hi! I have a 6700 XT with 12GB VRAM, running ComfyUI with ZLUDA and Stable Diffusion XL 1.0. When I set 1024x1024 the app crashes, and the terminal shows the message 'CUDA out of memory, tried to allocate 2.50 GiB'. Sorry, my question is not about this video T_T. Thanks for all your excellent videos.

luxiland

Do you also need the 23GB F16 files for Schnell and dev?

boltr

About 355 seconds running "flux1-dev-Q4_K_S" in ComfyUI on a Mac Studio (96GB / 38-core GPU). So, still unusable for me, but par for the course, because Apple doesn't care about MPS and open source.

rmeta

I am struggling a bit with Flux. I have a GeForce 3080 Ti, which is nothing to be scoffed at, and driver version 560 installed on Windows. I tried a bunch of different workflows with dev FP8, and all of them are super slow. I only have 64 GB of DDR5 RAM, but I haven't read anywhere that that should be a problem.

shushens