Stable Diffusion XL Is Here!



My latest paper on simulations that look almost like reality is available for free here:

Or this is the orig. Nature Physics link with clickable citations:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.

Károly Zsolnai-Fehér's links:
Comments
Author

Thank you for covering developments in AI so concisely! What a time to be alive, indeed!

bijectivity
Author

The open-source community needs to keep this level of growth going. We cannot, at any level, allow big companies to have total control over AI. The collective minds of the internet can and should do this better.

xero
Author

On a side note: The craziest part about Gustave Doré is that most of his black and white art is carved into wood, not painted. This has almost nothing to do with the video, but the audience deserves to know this.

flinkstiff
Author

It always feels so good to be called a Fellow Scholar on this channel!
Just like being called a Wonderful Person on Anton's channel.

jooei
Author

I've been using Stable Diffusion to generate concept art for a play I'm working on. It's amazing to be able to see high-quality renderings of a play that doesn't exist.

doomsdayman
Author

I love that you don't rush these out, and that you carefully approach them at the right time.

vgaggia
Author

I still cannot get over how well these image models can mimic caustics and other light effects

PookDaWook
Author

My big issue with using fewer words for "better" results is that I want very specific things and to change only the small parts I don't like. I like the grocery-list way of entering prompts because I have more control.

noobandfriends
Author

Actually played around with this a few weeks ago. I highly recommend using ComfyUI instead of A1111 (the AUTOMATIC1111 web UI), because SDXL is made for ComfyUI and you'll need the additional control over the workflow. I also highly recommend getting an SDXL safetensor from Hugging Face or Civitai, as the base model's training data is decent but fairly bare-bones.

It's good, but only for landscapes and close-ups; it really can't handle small details because of the static (the noise left in the latents).

It does generally require fewer inputs, but that's a bit of a double-edged sword: when you do try to get a more specific output with more inputs, it quickly starts to burn the latents, even with fairly low weights. It's best to stick to only a few inputs and just *hope* that it guesses the rest correctly, which it can do well as long as you're sticking to realistic scenarios.


That being said, here's a basic rundown of how it works:

First, the base model generates a latent with extreme amounts of static, which is basically what lets the larger model it uses take only a bit more VRAM than SD 1.5 otherwise needs.

Then that noisy latent is pushed through *another diffuser* (the refiner), which is basically *trained* on that static and removes it to the best of its ability, in a fairly similar fashion to how most upscalers work. Hence why small details (eyes, faces, etc.) *in a large area* don't come out very well, but work spectacularly when such parts are done in close-up: the fewer pixels of information the refiner has to work with in that area, the worse the output, just like an upscaler. (A rough code sketch of this two-stage setup follows after this comment.)

I was going to experiment with tiling to help deal with the issues it has with small objects, but at the time no one had posted any inpainting versions, and any inpainting model of SDXL will have to be aware of the static and ensure it inpaints correctly; otherwise the refiner won't be able to refine the edges between the tiles correctly, or just won't refine the inpainted area at all, as it currently seems to do.
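For anyone who wants to try the two-stage base-plus-refiner setup described above, here is a minimal sketch assuming the Hugging Face diffusers library and a CUDA GPU; the prompt, step count, and the 0.8 hand-off point are illustrative choices, not values from the video.

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

# Official Stability AI checkpoints for the base model and the refiner.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "a lighthouse on a cliff at sunset, volumetric light, photorealistic"  # example prompt

# Stage 1: the base model denoises the first ~80% of the schedule and hands
# off a still-noisy latent instead of decoding a finished image.
latent = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images

# Stage 2: the refiner, trained on the low-noise end of the schedule,
# finishes the remaining ~20% and sharpens fine detail.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=latent,
).images[0]
image.save("sdxl_base_plus_refiner.png")
```

The `denoising_end` / `denoising_start` pair is what passes the still-noisy latent from the base model straight to the refiner, rather than decoding it to pixels in between.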

BitShadow
Author

Yoooo this is DOPE! I love how much better this is! Just tried the clipdrop version and oh boy that's a treat!

miranda.cooper
Author

Welp, rule 34 is going to be stronger than ever.

blazingfuryoffire
Author

3:05 I wish this'd never get fixed
Jumbled text is so unreasonably hilarious to me

hundvd_
Author

That nuclear cooling tower to water kettle transition though

tdoge
Author

That shoe to mountain range example BLEW my mind!!

hellfiresiayan
Author

> “Two Minute Papers”
> Six Minute Video

randomcookieboy
Author

if more people had lobster claws instead of hands, it would be less problematic

xl
Author

My experience with SDXL so far:
Definitely better than base SD 1.5, but so far not much better than 1.5-derived community models.
Many of the popular SDXL community models appear severely overtrained at the moment, and therefore tend to have poor variety in outputs compared to their 1.5 counterparts.
Low CFG scale doesn't seem to promote creativity nearly as well as it does for 1.5 models. Might be a symptom of overfitting. (I've mostly messed around with community models.)
SDXL responds far more to high iteration counts than 1.5 models. I've seen meaningful quality improvements far beyond 50 iterations for some prompts on some models, and 20 iterations tends to come out a little too rough.
Sometimes faces get horribly garbled for no apparent reason, and other times they turn out absolutely beautiful. I still haven't figured out why.
SDXL does not handle low resolutions well at all! There's little reason to go below 1024 x 1024 for the initial generation. (A rough settings sketch follows after this comment.)
As expected, SDXL is great for upscaling. It doesn't get confused as quickly as 1.5 models.
I miss my lora collection 😢

I see great potential in SDXL. At the moment I prefer 1.5 with its ludicrous selection of excellent community models, textual inversions and loras, but I figure the SDXL ecosystem just needs a month or two to mature before we really get to see what it is capable of.
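As a rough illustration of the settings discussed in this comment (native 1024 x 1024 resolution, 50+ steps, moderate CFG), here is a minimal sketch assuming the Hugging Face diffusers library; the prompt, step count, and guidance value are just ballpark numbers mirroring the comment, not tested recommendations.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Official SDXL base checkpoint; community safetensors can be loaded the same way.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "portrait of a medieval alchemist in a cluttered workshop"  # example prompt

# Stay at SDXL's native 1024 x 1024 for the initial generation and give the
# sampler plenty of iterations; ~20 steps tends to look rough, 50+ keeps improving.
image = pipe(
    prompt=prompt,
    width=1024,
    height=1024,
    num_inference_steps=50,
    guidance_scale=7.0,  # moderate CFG; very low values help less than on SD 1.5
).images[0]
image.save("sdxl_1024_50steps.png")
```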

fnorgen
Author

Careful! Setting up Stable Diffusion XL locally can lead to worse results, because it needs a very specific configuration (software stack and parameters) to get the best possible results. I think that might have happened here. Apart from that, great video as always!

johanavril
Author

It would be really helpful if Clipdrop's Reimagine feature had a text prompt so that you could guide the AI in the direction you want the reimagining to go, rather than it just being random.

armartin
Author

I can't wait for SDXL to go Open Source

PcGamer