But what is a GPT? Visual intro to transformers | Chapter 5, Deep Learning

Breaking down how Large Language Models work

---

Here are a few other relevant resources

Build a GPT from scratch, by Andrej Karpathy

If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic:

If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources.
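
The "combined low-rank map" idea above can be illustrated with a toy sketch. Everything here is an assumption for illustration: the sizes (`d_model = 4`, `d_head = 2`), the matrix entries, and the helper names `matmul` and `rank` are made up, and real models use far larger dimensions. The point is only that applying a value matrix and then an output matrix is the same as applying their single product, and that this product maps the embedding space to itself with rank at most the head size.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rank(m):
    """Rank via Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

d_model, d_head = 4, 2  # hypothetical toy sizes

# Value matrix: maps the embedding space (d_model) down to the head space (d_head).
W_V = [[1, 0, 2, -1],
       [0, 1, 1, 3]]

# Output matrix: maps the head space (d_head) back up to the embedding space.
W_O = [[1, 2],
       [0, 1],
       [3, 0],
       [1, 1]]

# The combined map is a single d_model x d_model matrix.
W_OV = matmul(W_O, W_V)

x = [[1], [2], [3], [4]]                 # an embedding, as a column vector
two_step = matmul(W_O, matmul(W_V, x))   # down-project, then up-project
one_step = matmul(W_OV, x)               # apply the combined map directly

print(two_step == one_step)  # both routes give the same vector
print(rank(W_OV))            # cannot exceed d_head
```

Exact fractions are used so that the rank computation is not affected by floating-point rounding; with a numerical library one would typically compute the rank from singular values instead.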

Site with exercises related to ML programming and GPTs

History of language models by Brit Cruise, @ArtOfTheProblem

An early paper on how directions in embedding spaces have meaning:

---

Timestamps

0:00 - Predict, sample, repeat
3:03 - Inside a transformer
6:36 - Chapter layout
7:20 - The premise of Deep Learning
12:27 - Word embeddings
18:25 - Embeddings beyond words
20:22 - Unembedding
22:22 - Softmax with temperature
26:03 - Up next

3Blue1Brown

Comments

I graduated in Computer Science in 2017. Back then, the cutting edge of ML was Recurrent Neural Networks, on which I based my thesis. This video (and I'm sure the rest of this series) just allowed me to catch up on years of advancements in so little time.

I cannot describe how important your teaching style is to the world. I've been reading articles, blogs, papers on embeddings and these topics for years now and I never got it quite like I got it today. In less than 30 minutes.

Imagine a world in which every teacher taught like you. We would save millions and millions of man hours every hour.

You truly have something special with this channel and I can only wish more people started imitating you with the same level of quality and care. If only this became the standard. You'd deserve a Nobel Prize for propelling the next thousand Nobel Prizes.

iau

Grant casually uploading the best video on Transformers on YouTube

DynestiGTI

The fact that the meaning behind tokens is embedded into this 12,000-dimensional space, and that you get relationships in terms of coordinates and direction that exist across topics, is mind-blowing. Like, Japan → sushi is similar to Germany → bratwurst is just so darn neat

tempo

This is heaven for visual learners. Animations are correlated smoothly with the intended learning point ...

lewebusl

I was trying to understand ChatGPT through videos and texts on the Internet. I always said: I wish 3b1b would release a video about it, it's the only way for someone inexperienced to understand, and here it is. Thank you very much for your contributions to YouTube!!

billbill

The return of the legend! This series is continuing, that is the best surprise of YouTube, thanks Grant, you have no idea how much the young population of academia is indebted to you.

Silent_Knife

2 years ago I started studying transformers, backpropagation and the attention mechanism. Your videos were a cornerstone for my understanding of those concepts!
And now, partially thanks to you, I can say: “yeah, relatively smooth to understand”

lucasamadsen

I wish I had a friend as passionate as this channel is. It's like finding the family I've always wanted to have

Kargalagan

I don't even know how many times I'm going to rewatch this.

parenchyma

I have been working on transformers for the past few years and this is the greatest visualization of the underlying computation that I have seen. Your videos never disappoint!!

nicholaitukanov

Thank you! You're so late, 3Blue1Brown; it took me 10 hours of videos + blogs last year to understand what a transformer is! This is the long-awaited video! I'm sending this to all my friends.

jerryanyu

You are such an AMAZING teacher. I feel like you've really given thought to the learner's perception and are kind enough to take the time to address asides and gotchas while you meticulously build components and piece them together, all with a very natural progression that's moving towards "something" (hopefully comprehension). Thank you so much for your time, effort, and the quality of your work.

ogginger

here's to hoping this is not an April fools

JustinLe

You *must* turn the linguistic vector math bit into a short. It is pure gold.

chase_like_the_bank

It's absolutely ridiculous how many aspects of this topic finally clicked for me in this intro video already. This was incredibly well explained and I'm so thrilled for the next chapters. Thank you very much, Grant!

tielessin

Man! You never fail to enlighten, entertain, and inspire us, nor do we get enough of your high-quality, yet very digestible, content! Thank you, Grant!

jaafars.mahdawi

It's astonishing and amazing that this kind of info and explanation quality is available for free; this is way better than a university would explain it

yashizuko

Grant shows just how creative you can get with linear algebra. Who would have guessed language (?!) was within its reach?

Mutual_Information

The genius in what you do is taking complicated concepts and making them easy to digest. That's truly impressive!

mahdimoradkhani

Blown away by the elegance - both visually and conceptually - in which this extremely complicated topic was taught! I never comment but was moved to express my sincerest gratitude! Thank you for all the time put into these beautiful videos.

alyssachen
