The moment we stopped understanding AI [AlexNet]


Activation Atlas Posters!

Special thanks to the Patrons:
Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti

Welch Labs

References
AlexNet paper: Krizhevsky, et al., "ImageNet Classification with Deep Convolutional Neural Networks", NeurIPS, 2012.

Carter, et al., "Activation Atlas", Distill, 2019.

Olah, et al., "Feature Visualization", Distill, 2017.

Templeton, et al., "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet", Transformer Circuits Thread, 2024.

Jason Yosinski's "Deep Visualization Toolbox" video inspired many visuals:

Great LLM/GPT Intro paper

3Blue1Brown's GPT videos are excellent, as always:

Andrej Karpathy's walkthrough is amazing:

Goodfellow’s Deep Learning Book

GPT-3 size, etc.: Brown, et al., "Language Models are Few-Shot Learners", 2020.

GPT-4 training size, etc. (speculative):

Historical Neural Network Videos

Errata
1:40 should be: "word fragment is appended to the end of the original input". Thanks to Chris A for finding this one.
Comments

30 years ago, I used to work with an older guy who had retired from IBM. I was barely out of high school, and he used to tell me that neural networks were going to change the world once people figured out how to train them properly. Unfortunately, he didn't live to see his dream become reality, but he was totally right.

EdgarVerona

Fun fact: the kernels used in vision models work pretty much the same way our retinas perceive objects. In a similar structure, our eyes have cells that perceive edges at certain angles, then shapes, then objects, in increasing abstraction.

khanghoutan

"one way to think about this vector, is as a point in 4096 dimentional space"
give me a minute, I now gotta visualise a 4096 dimentional space in my head.

somnvm

I was working with deep neural networks at university during the late 90s. The main issue that stopped all progress was the kind of function used between layers (the sigmoid as the activation function): it effectively stopped learning from backpropagating past the output layers and limited how many layers you could use (the problem is called the vanishing gradient). Once people rediscovered ReLU (it was invented in the early 70s, I believe, but I think the inventor published it in Japanese, so it went unnoticed), deep neural networks became possible. High computation needs were only a problem if you wanted real-time or low-latency results; in those days we used to leave the computer calculating overnight to get something the next day.
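A toy numeric sketch of the vanishing-gradient point above (my own illustration in Python/NumPy, not from the video or the comment): backprop multiplies one activation-derivative factor per layer, and the sigmoid's derivative never exceeds 0.25, so a deep chain of sigmoids drives the gradient toward zero, while ReLU passes it through unchanged wherever the unit is active.

```python
import numpy as np

# Sketch: gradient magnitude after backpropagating through a 30-layer
# chain of single units. Each layer contributes one factor: the
# derivative of its activation at the pre-activation value x.

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)          # peaks at 0.25 when x == 0

def relu_grad(x):
    return (x > 0).astype(float)  # 1 where active, 0 otherwise

x = np.full(30, 0.5)              # positive pre-activation at every layer

print("sigmoid:", np.prod(sigmoid_grad(x)))  # ~1e-19 -- vanished
print("relu:   ", np.prod(relu_grad(x)))     # 1.0    -- intact
```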

ernestuz

That real-time kernel activation map was life-changing.
If, whilst editing these videos, you've ever questioned whether the vast amounts of effort are worth what amounts to a brief, 10s clip, just know that it's these moments which have truly stuck with me.

JustSayin

The visualization is what takes this video from good to fantastic. It's very evident you put a lot of effort into making this visually engaging, which is very didactic!

..

I've been in the field for 10 years and never had anyone describe this so clearly and visually. Brilliant, thank you!

hcy

I stopped understanding AI around the six minute mark.

Sam_Saraguy

Most people think AI is a brand-new technology, while in reality studies on computational neural networks date all the way back to the 1940s. That's insane.

samuelspace

2:17 - THIS symbolizes the difference between "AI-generated content" YouTube channels and channels made by real human beings, like yours.

millanferende

Computers not being fast enough to make a correct algorithm practically usable reminds me of Reed–Solomon error correcting codes. They were developed in 1960 but computers were too slow for them to be practical. They went unused until 1982 when they were used in Compact Discs after computers had become fast enough.

kellymoses

2:40 Dude, this single picture right here, the way you described it, was literally the thing that truly helped me understand how this all works. Thank you!

crownoffyre

Your visualisations helped a few concepts click for me, around layers and activations, that I've struggled to understand for years. Thanks!

optiphonic_

A great learning experience I had was to deep dive into the bitmap format and multiply greyscale images with 3x3 or 5x5 arrays holding simple patterns, e.g. all zeros with a -1 in the middle. Different array patterns highlight edges or remove them. It was a really eye-opening exercise, one any software person should try, that shows these fundamental operations at work. Great video.
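Here is a minimal sketch of that exercise (assuming Python/NumPy; the Sobel-like kernel and toy image are my own illustrative choices, not from the comment): slide a 3x3 array across a greyscale image and sum the elementwise products at each position. Note that the all-zeros-with-a--1 kernel mentioned above simply negates each pixel; a kernel with opposite signs on its left and right columns responds to vertical edges.

```python
import numpy as np

# Naive 2D "valid" convolution: slide the kernel over the image and
# sum elementwise products at each position.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Sobel-like vertical-edge kernel: negative weights on the left,
# positive on the right, so flat regions cancel to zero.
edge_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 255.0

print(convolve2d(image, edge_kernel))  # large values only along the edge
```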

beautifulsmall

Awesome video! Funny how the moment we stopped understanding AI also appears to be the moment it started working lol

michaelala

I've been studying AI for the past year, and the first two minutes were the best explanation I have seen so far of how Transformers and ChatGPT work. I've studied everything from Andrew Ng's Coursera courses to Andrej Karpathy and more. Thank you for this great video!

AlvingGarcia

I really appreciate how well you communicate non-verbally despite using very little A-roll. Your expressions are clear yet natural, even while reading, enunciating, and employing tone, and there's no fluff; you have a neutral resting point for your hands to signal that there's no gesture to pay attention to.

I couldn't find anything to critique in your vids if I tried and this seems particularly easy to overlook. Thanks for every absolute banger!

frostebyte

17:02 - What mostly forgotten AI algorithms are there? We have expert systems, neural networks, backtracking, semantic nets, decision theory, fuzzy logic, temporal-difference learning, and that's about it.

totheknee

Amazing intro with the scissors and cardboard 👏

emrahe

1:04 the moment I stopped understanding this video

thicksteve