DALLE-2 has a secret language!? | Theories and explanations

DALL-E 2 has a secret language? No, it's more of a secret vocabulary. Let's see what happens and why the model behaves like this.

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏

Outline:
00:00 DALL-E 2 has a secret vocabulary
01:05 Weights&Biases (Sponsor)
02:34 How DALL-E 2 responds to gibberish
05:00 Why does this happen?
07:59 Security implications (adversarial attacks)

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research​

Music 🎵 : Built to Last (Instrumental) - NEFFEX
Video editing: Nils Trost
Comments

Well, the neural network HAS TO map any input to an image, so random strings would have to mean something

zahar
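
To make the point above concrete: a byte-pair-encoding tokenizer never rejects an input, it just splits an unfamiliar string into known subword pieces, each of which gets an embedding. A minimal sketch, assuming the public openai/clip-vit-base-patch32 tokenizer from the transformers library as a stand-in for DALL-E 2's CLIP tokenizer; "Apoploe vesrreaitais" is one of the gibberish phrases from the original "secret language" discussion:

```python
# A minimal sketch: a BPE tokenizer splits ANY string into known subword tokens,
# so the text encoder always produces an embedding for it -- gibberish included.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

for prompt in ["a bowl of fruit", "Apoploe vesrreaitais", "xkjhqwz"]:
    tokens = tokenizer.tokenize(prompt)            # subword pieces, never "unknown"
    ids = tokenizer.convert_tokens_to_ids(tokens)  # every piece has a valid id
    print(f"{prompt!r:28} -> {tokens} {ids}")
```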

Some experiments I read about today suggest that the VAE or autoencoder is lossy and removes so much detail that it can make text illegible, even when just upscaling the same image that is input. The result of using ONLY a VAE looked eerily like the text jumble these models of all sizes produce. So I also wonder whether the text encoder, if big enough, just figures out the text, or, where there is no text, can impute text better. But where guidance is needed, or with smaller text encoders, it cannot make sense of the scrambled eggs produced by lossy autoencoders. Maybe, just a thought to think about?

salomeshunamon
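
The round-trip experiment described above is easy to try with public components. A minimal sketch, assuming Stable Diffusion's released VAE from the diffusers library as a stand-in (DALL-E 2's own components are not public) and a hypothetical input file sign.png containing readable text:

```python
# A minimal sketch of the VAE round-trip experiment: encode an image containing
# text, decode it again, and check how legible the text remains.
import torch
from PIL import Image
from torchvision import transforms
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

# "sign.png" is a hypothetical placeholder for any image with readable text.
image = Image.open("sign.png").convert("RGB").resize((512, 512))
pixels = transforms.ToTensor()(image).unsqueeze(0) * 2 - 1      # scale to [-1, 1]

with torch.no_grad():
    latents = vae.encode(pixels).latent_dist.sample()           # lossy compression step
    recon = vae.decode(latents).sample                          # reconstruct the image

recon = ((recon.clamp(-1, 1) + 1) / 2).squeeze(0)               # back to [0, 1]
transforms.ToPILImage()(recon).save("sign_roundtrip.png")       # inspect whether the text survived
```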

Great analysis and also impressed with your nonchalant pronunciation of the gibberish.

stevenmitchell

I like the silly words that Dall-E2 comes up with 🤣

DerPylz

I would LOVE Dalle-2 language classes lol
I have noticed, talking with a friend, that all the gibberish sounded somewhat... Latin-esque. It makes sense that the AI is relying on taxonomy, which is always treated grammatically as Latin (using the case structure etc.), but can be derived from pretty much any language (which is actually possibly one of our own practices in recording endangered languages).
I've played a lot with this very friend trying to understand how the AIs approximate the meaning of a word from context and frequency analysis. Riding the Wordle wave, we've been playing Pimantle, which traces a 2D map of Google's word2vec embeddings, and it's frequently frustrating how winning the game is not always about thinking in meaning but in the morphology of the words, or how sometimes very generic or ultra-specific far-fetched words score high in proximity to the secret word.

victorplacidorangel
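
For the word-proximity scoring mentioned above, here is a minimal sketch, assuming the pretrained Google News word2vec vectors that gensim can download (Pimantle's exact setup is an assumption here); the secret word and the guesses are hypothetical examples:

```python
# A minimal sketch of word2vec-style proximity scoring between a secret word
# and a list of guesses.
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")   # large one-time download on first use

secret = "whale"                            # hypothetical secret word
for guess in ["dolphin", "whales", "ocean", "mammal", "boat"]:
    print(f"{guess:10s} {wv.similarity(secret, guess):.3f}")

# Morphological neighbours (e.g. the plural "whales") often score about as high
# as genuinely semantic ones, which matches the frustration described above.
print(wv.most_similar(secret, topn=5))
```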

That's quite interesting! I tried to use the same method to get GLIDE's vocabulary, but the text that it outputs is unreadable. Instead, I've just tried putting random gibberish into the GLIDE model. It seems that "lkjnf" means airplanes (GLIDE almost always outputs images of planes) and "dnfnfjwnv" means some kind of ambulance (GLIDE almost always outputs vehicles that have sirens and checkered patterns that resemble ambulances).

However, when I try putting in "a red dnfnfjwnv", it outputs images of red trucks. I'm guessing the vector representation of the word "dnfnfjwnv" is just close to that of "vehicle".

linkanjarad
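
The "close in embedding space" hypothesis above can be probed directly. A minimal sketch, assuming the public openai/clip-vit-base-patch32 text encoder from transformers as a stand-in for GLIDE's (non-public) text encoder:

```python
# A minimal sketch: is the gibberish prompt's text embedding close to that of
# ordinary words like "vehicle"?
import torch
from transformers import CLIPTokenizer, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["dnfnfjwnv", "vehicle", "ambulance", "airplane", "cat"]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")

with torch.no_grad():
    embeds = model.get_text_features(**inputs)           # one embedding per prompt
embeds = embeds / embeds.norm(dim=-1, keepdim=True)      # normalize for cosine similarity

for word, sim in zip(prompts[1:], (embeds[0] @ embeds[1:].T).tolist()):
    print(f"{word:10s} {sim:.3f}")                       # gibberish vs. each real word
```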

I'm wondering if this type of behavior could also happen inside biological networks at deep levels, and whether there are "protection" layers that prevent external stimuli from triggering strange memory recollections or strange language associations.

Micetticat

I think what Joscha Bach is trying to say is that DALL-E 2 might be good at associating one concept with a word, but not multiple concepts at once (i.e. not a language, just a vocabulary). :D
Though the language isn't fully gibberish after all, the effect is still very interesting, like accidentally stepping into another alien dimension.

giantbee

What's interesting to me is that when you pronounce those gibberish words, my brain just assumes you are talking in your native language, whereas if someone who lives locally were to pronounce them in an accent I'm more familiar with, I'd recognize them as gibberish.

I don't know if that's because my brain doesn't have enough training data for different accents or if yours has more (as people on other continents tend to have broader experience with other languages than most people in North America - I'm in Canada).

justinwhite