This new type of illusion is really hard to make


Generative AI can be used to make images that look different in different orientations or different arrangements of parts.

You can buy my books here:

You can support me on Patreon and get access to the exclusive Discord:

just like these amazing people:

Glenn Watson
Peter Turner
Joël van der Loo
Matthew Cocke
Mark Brouwer
Deneb

Comments

15:14 Bias and hallucination in the context of generative AI aren't simply human fallibilities, they're the mechanism by which it functions: you're handing an algorithm a block of random noise and hoping it has such strong biases that it can tell you exactly what the base image looked like even though there never was a base image.
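A toy sketch of that point, with everything invented for illustration: the "denoiser" below is nothing but a fixed bias toward one target pattern, and repeatedly applying it recovers that pattern from pure noise even though there never was a base image. Real diffusion models learn their biases from training data instead of hard-coding a target.

```python
import random

# TARGET is the pattern this toy denoiser "believes in" (invented numbers).
TARGET = [0.0, 1.0, 0.5, 0.25]

def denoise_step(x, strength=0.2):
    # Nudge every value a little toward the biased guess of the clean image.
    return [xi + strength * (t - xi) for xi, t in zip(x, TARGET)]

random.seed(0)
x = [random.uniform(-1, 1) for _ in TARGET]  # pure noise, no base image
for _ in range(50):
    x = denoise_step(x)

# After enough steps the noise has been "recognised" as the target pattern.
print([round(v, 3) for v in x])
```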

alexholker

Hey Steve and Matt, thank you guys for featuring our research – it was a lot of fun working with you! I'm Ryan Burgert, the author of Diffusion Illusions - I'll try to answer as many questions as I can in the comments!

Neptutron

Okay, hear me out. THIS is AI art. Not people using AI to just generate whatever they put in a prompt. But actual human creativity and ingenuity using AI as a tool to create something which previously would have been extremely difficult, if not impossible. There are a lot of ethical and aesthetic problems with generative AI in its current state, but this is the first time I've seen something made with AI and thought "that's beautiful".

CriticalMonkey

Loved the Matt Parker jumpscare in the image sequence

Gakulon

The rabbit/duck illusion got a serious glow-up

PixelSodaCreates

So a person could do this too: rough outline sketch of a penguin and of a giraffe; flip one, work out an average rough from both; flip one back, add more detail to both, flip one again. Repeat till you're happy or you give up.

But some people just do it in their head - amazing!

dekumarademosater

The reason some text models struggle with counting the number of r characters in a word like strawberry is that they don't see the word: they receive a vector trained to represent the word's different meanings when viewed through different filters, similar to these illusions. That's what attention QKV projections do: extract information that's layered into the vector. Sometimes the vector happens to store information about a word such as its spelling or rhymes, which the model can use, but often not; it depends on how often such things appear in the training data. The model could count the letters if the word were split into individual characters with spaces between them, because each would encode into a unique vector.
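A minimal illustration of that point. The tokenizer below is made up (it is not a real BPE vocabulary): it splits "strawberry" into two subword chunks, inside which the individual letters are opaque to the model, whereas spelling the word out character by character makes the count directly visible in the sequence.

```python
def toy_tokenize(word):
    # Hypothetical subword split: the model only sees two opaque token IDs.
    return ["straw", "berry"]

word = "strawberry"
tokens = toy_tokenize(word)
# From the model's point of view each token is just an ID / vector; the
# "r-r-r" inside it is only recoverable if it happened to be memorised.
print(tokens)                # ['straw', 'berry']

# One token per character: now the count is visible in the sequence itself.
letters = list(word)
print(letters.count("r"))    # 3
```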

APrettyGoodChannel

A Mould-Parker crossover video about double image illusions in which you create several of them and you didn't do one that morphed from Parker to Mould?

Hannah-vm

Salvador Dali has a painting which looks like a woman in a dress going through a door in some kind of cubic world. When you go to take a picture of it, it looks like a pixelated Abraham Lincoln

gtdfytv

Those blocks would sell really well in gift shops. Especially in zoos.

davetech

Oh the overlap with mundane cryptography could be interesting. The order of words could be scrambled between two outputs.

The idea of synthesizing sound that says different things if you understand different languages is kinda horrifying.

Dialethian

4:30 Minor nit. I don’t think the token embedding is really embedding based on semantics. It’s embedding based on how humans have used tokens in our writing. Since we tend to use semantically similar tokens in linguistically similar ways, the embedding does tend to cluster semantically similar tokens near each other. But it will also cluster tokens that aren’t semantically similar, merely because they’re used in the same way linguistically. For example “the” and “his” will be near each other in the embedding space not because they’re similar in meaning, but because they’re interchangeable in many sentences.
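That clustering can be shown with a toy example. The 3-d "embeddings" below are invented numbers, not real model weights: "the" and "his" get similar vectors because they appear in similar sentence positions, so their cosine similarity is high despite the words meaning different things.

```python
import math

# Invented toy embeddings, purely illustrative.
emb = {
    "the": [0.9, 0.1, 0.0],
    "his": [0.8, 0.2, 0.1],
    "cat": [0.1, 0.9, 0.3],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# "the" sits far closer to "his" than to "cat" in this toy space.
print(round(cosine(emb["the"], emb["his"]), 2))
print(round(cosine(emb["the"], emb["cat"]), 2))
```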

truejim

Just don't forget that "but it works either way" actually means scientists have tried, I would assume, thousands of ideas regarding network architectures, hyperparameters, etc., and only some of those ideas worked well enough to allow the next step. Showcasing results is one thing; developing the models is another. It's hard work.

eler

The idea of generating images by removing noise is just as crazy as LLMs that generate text by predicting the next word (these are gross simplifications, but that's basically what it is).
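The "predict the next word" simplification can be made concrete with a toy bigram model: count which word follows which in a tiny corpus, then generate by always emitting the most frequent follower. Real LLMs use neural networks and sample probabilistically; everything here is invented for illustration.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count, for each word, which words follow it and how often.
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def generate(word, n=4):
    # Greedy generation: repeatedly append the most common next word.
    out = [word]
    for _ in range(n):
        if word not in follows:
            break
        word = follows[word].most_common(1)[0][0]
        out.append(word)
    return out

print(generate("the"))  # → ['the', 'cat', 'sat', 'on', 'the']
```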

Tigrou

This wasn't a video about how diffusion models work and are trained... but you still managed to explain both better than the majority of videos on YT about the subject. Can you make a video explaining how you became so damn good at explaining things?

Oh, and this is the coolest application of image generators I've seen to date. Brilliant idea leveraging the intermediate diffusion steps to sneakily steer the result into multiple directions simultaneously!

etunimenisukunimeni

In the settings of automatic1111, you can enable a clip-skip slider right up top next to your model, VAE, etc. Very useful if you're playing around with CLIP, especially when you've got novel-length prompts. It doesn't really help you understand how the vector spaces work, but it does help you pretend to understand how they work.

amarissimus

I'm a software engineer and a midjourney user, and I've watched maybe 50 - 100 videos on LLM and generative AI.

In 17 minutes you managed to provide the best simple explanation for how generative AI works with LLMs to produce images from prompts.

Steve, you should teach a paid course on this stuff.

extube

Stopping it halfway is exactly how you would do it with physical media:
do a sketch, reorient, edit the sketch, repeat.
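That sketch-flip-sketch loop can be written down as a toy version of the averaging described in the video: compute an update pulling the image toward one target, compute another pulling the flipped image toward a second target, and average the two. The 2x2 "images", targets, and update rule below are all invented for illustration; this is not the paper's actual diffusion guidance.

```python
def flip(img):
    # 180-degree rotation of a 2x2 image stored row-major.
    return img[::-1]

TARGET_A = [1.0, 0.0, 0.0, 0.0]  # what the image should look like upright
TARGET_B = [1.0, 0.0, 0.0, 0.0]  # what it should look like when flipped

def step(img, lr=0.2):
    # Update pulling the upright image toward target A...
    d_a = [lr * (t - x) for x, t in zip(img, TARGET_A)]
    # ...and an update pulling the *flipped* image toward target B,
    # flipped back into the upright frame before mixing.
    d_b = flip([lr * (t - x) for x, t in zip(flip(img), TARGET_B)])
    # Average the two updates so both views improve together.
    return [x + (a + b) / 2 for x, a, b in zip(img, d_a, d_b)]

img = [0.5, 0.5, 0.5, 0.5]
for _ in range(200):
    img = step(img)
print([round(v, 2) for v in img])  # → [0.5, 0.0, 0.0, 0.5]
```

The loop settles on a compromise image that is bright in the top-left when upright and bright in the top-left when rotated, which is exactly the two-views-at-once behaviour the physical sketch loop aims for.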

DangerDurians

1:01 poor rabbit being called trash by Steve

SnakeSolidPL

This is a really educational video on AI which _should_ help most people understand and realise that these LLM and diffusion models are not general AI (i.e. "truly intelligent") but just mathematical models. I studied AI and ML long before LLMs became a thing and have always been aware of this, but convincing people of it in a short timeframe is very hard.

TheClintonio