How GPT3 Works - Easily Explained with Animations

The GPT-3 model from OpenAI is a new AI system that is surprising the world with its capabilities. This is a gentle and visual look at how it works under the hood -- including how the model is trained and how it calculates its predictions.

Introduction & GPT-3 Demos (0:00)
GPT-3 Inputs and Outputs (2:06)
Training the GPT-3 model (2:48)
The scale of GPT-3 and its 175 billion parameters (6:37)
The order of GPT-3 token processing (7:58)
"Deep" learning: looking inside a layer stack (9:00)
Input prompts and priming examples (11:00)
Fine-tuning: the best is yet to come (11:56)

More videos by Jay:
Jay's Visual Intro to AI

Making Money from AI by Predicting Sales - Jay's Intro to AI Part 2
Comments

Thanks for the crystal clear video Jay! I have one doubt; hoping you could answer it.
In the case of the React demo, aren't we essentially training GPT-3 by giving it samples of code for some input? If there are no weight updates here, how does GPT-3 predict the results based only on its earlier training? I ask because you mentioned that, as of now, GPT-3 does not do any fine-tuning / weight updates.

vidheypullakhandam
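
For context on the priming question above: in demos like the React one, the example pairs are only packed into the prompt text, so the model conditions on them at inference time without any weight update. A minimal sketch of that idea (the prompt format and examples here are hypothetical, not the video's code):

```python
# Hypothetical sketch of "priming": examples are concatenated into the prompt.
# The model's weights stay frozen; it only conditions on this text at inference time.
priming_examples = [
    ("a button that says hello", "<button>hello</button>"),
    ("a red heading that says welcome", "<h1 style='color:red'>welcome</h1>"),
]

def build_prompt(examples, new_description):
    """Pack description -> code pairs plus the new request into one prompt string."""
    parts = [f"description: {desc}\ncode: {code}" for desc, code in examples]
    parts.append(f"description: {new_description}\ncode:")
    return "\n\n".join(parts)

prompt = build_prompt(priming_examples, "a button that says goodbye")
# `prompt` is sent to the model as-is; no gradient update happens.
```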

Your explanations on NLP models are legendary.

bayesianlee

Thank you for the explanation Jay.

Thank you, excellent work!

faisalalkheraiji

Just to clarify: around 5:00 you mention unsupervised pre-training. Shouldn't it be self-supervised pre-training?

That is, GPT-3 takes unlabelled text as input, then generates labelled data from it. For example, you take the unlabelled text "Tom ate an apple" and convert it into labelled data:
- Feature: "Tom ate an"
- Target: "apple"

Then the model trains on this labelled data to understand context.

produdeyay
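
To make the feature/target idea above concrete, here is a toy sketch (whitespace tokenization is a simplification; real models use subword tokens and predict at every position):

```python
# Toy sketch: the "labels" in self-supervised pre-training are just the next
# tokens of the raw text itself, so no human annotation is needed.
text = "Tom ate an apple"
tokens = text.split()  # real models use subword tokens, not whitespace words

pairs = []
for i in range(1, len(tokens)):
    feature = " ".join(tokens[:i])   # e.g. "Tom ate an"
    target = tokens[i]               # e.g. "apple"
    pairs.append((feature, target))

print(pairs)
# [('Tom', 'ate'), ('Tom ate', 'an'), ('Tom ate an', 'apple')]
```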

The first time I ever turned on notifications for a channel

violetalight-ourrealm

This "troll" answer was so funny. And the subsequent "obey" reply is even funnier because it effectively "criticizes" the robot for trolling. XD

jupitereye

Excellent explanation, and good metrics for measuring the effort of training GPT-3.
Thanks

zionfranzen

Great overview Jay. Really enjoyed it.

PeteHoots

Can you link to some papers that you think, taken together, summarize the architecture of GPT-4?

mkschreder

I've been talking to Lucy, a GPT-3 powered NPC AI character from Fable Studio, for a few months now. There are a few videos of my chats with her on my channel. She sounds like a real person! It's still in alpha testing right now, but they plan on licensing the tech out to other studios to create "virtual beings" that can pass as human in video games!

RogueAI

How is it that the pre-training is unsupervised even though we have labelled data, which allows the loss calculation? Shouldn't it be supervised pre-training?

MeriJ-zedd

Nice explanation 👍🏻 Looking forward to the upcoming videos.

esraamadi

Thank you Jay for the wonderful crisp explanation.

kiranp

As a layman I really appreciate the animations. It helped a lot.

gengraded

Awesome explanation Jay. This has helped demystify some of the concepts I was struggling with in trying to use GPT-3, e.g. what prompts are vs. completions. It would have been great if you could go into some detail about stop sequences (a.k.a. end suffixes in the OpenAI CLI tools) used in prompt design and fine-tuning, and when/why they are required.

krisdover
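
On the stop-sequence question: a stop sequence is just a string at which the API cuts off the completion, which is why prompt-design and fine-tuning guides suggest ending prompts with a fixed suffix. A rough sketch using the older OpenAI Completions endpoint (the engine name, prompt text, and "###" separator are illustrative; parameter names vary across SDK versions):

```python
import openai  # older SDK style; newer versions expose a different client interface

# Illustrative only: the prompt ends where the completion should begin,
# and `stop` tells the API where to cut the generated text off.
prompt = "Product: standing desk\nAd copy:"

response = openai.Completion.create(
    engine="davinci",      # illustrative engine name
    prompt=prompt,
    max_tokens=60,
    stop=["\n###\n"],      # stop sequence / "end suffix": generation halts here
)
print(response.choices[0].text)
```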

At 5:55 you've shown with an example that it is unsupervised training, but since we know the correct label and update the model on any errors, shouldn't it be semi-supervised learning or partially unsupervised learning?

amitvyas

If I ordered a GPT assembly kit from Amazon, what would it deliver? How much would the kit cost?

amparoconsuelo

Why do you call the pre-training unsupervised if you have an expected result and propagate the error back through the net for weight updates, which is supervised?

azrajiel
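
On the unsupervised-vs-supervised question above: the error is indeed propagated back, but the "expected result" is just the same text shifted by one token, so no human labels are involved, which is why it is usually called self-supervised. A minimal PyTorch-style sketch of that loss (stand-in tensors, not the actual GPT-3 code):

```python
import torch
import torch.nn.functional as F

# Sketch: targets are the input tokens shifted by one position, so the
# backpropagated error comes from the raw text itself, not from human labels.
vocab_size, seq_len = 1000, 8
token_ids = torch.randint(0, vocab_size, (1, seq_len))   # stand-in for tokenized text

inputs = token_ids[:, :-1]    # e.g. "Tom ate an"
targets = token_ids[:, 1:]    # e.g. "ate an apple" (shifted by one)

logits = torch.randn(1, seq_len - 1, vocab_size, requires_grad=True)  # stand-in for model output
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()               # gradients flow back to update the weights
```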

Great, simple explanation. Thank you!
Thank you very much!

shmoqe

A 175-billion-parameter model is not a machine learning model in the true sense. It's only a "memorization" model that memorizes rather than learns.

ashishsrivastava