Pretraining vs Fine-tuning vs In-context Learning of LLM (GPT-x) EXPLAINED | Ultimate Guide ($)

Pretraining, fine-tuning, and in-context learning of LLMs (like GPT-x and ChatGPT) EXPLAINED | The ultimate guide, including price brackets as a rough indication of the compute and financial resources you need when deciding how to train LLMs.

A simple explanation of the differences between pretraining, fine-tuning, and ICL (in-context learning) for an LLM, like GPT-3.5-turbo or ChatGPT.
The simplest explanation possible on this planet!
The ultimate guide for beginners to LLM!

#promptengineering
#ai
#generativeai
#naturallanguageprocessing
#chatgptexplained
Comments

Thank you for illustrating both the concepts and then doing the math on pre-training, fine-tuning and ICL!

JohnLangleyAkaDigeratus

Maybe I misunderstood, but you seemed to imply that ICL persists between sessions and across users, and that's not the case. It only exists within the conversation.

CptAJbanned

Austria!?! Interesting. Guess you teach!? Where? Do you personally know Hochreiter Sepp? Are you a fan of Károly Zsolnai-Fehér too?
Thanks a ton for your effort!! And, ahh, yes, Colabs are great!

gue

Regarding the last piece on ICL for the QA task: I thought that was fine-tuning, no? I thought ICL is mostly about context in the prompt.

GenzhPuff

If ICL affected the model's responses for every user, how would the weights and biases of the DNN not be modified?

LiorHilel-RunAI

What about LoRA? It's also technically fine-tuning, but at a fraction of the cost.

robinmountford
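To make the LoRA comment above concrete, here is a rough back-of-the-envelope sketch (my own illustration, not from the video) of why LoRA costs a fraction of full fine-tuning: instead of updating a full d×k weight matrix, it trains two low-rank factors of shape d×r and r×k.

```python
def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when updating a full d x k weight matrix."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter: A (d x r) plus B (r x k)."""
    return d * r + r * k

# Example: one projection matrix at GPT-3 hidden size (d = k = 12288)
d = k = 12288
full = full_finetune_params(d, k)   # 150,994,944 parameters
lora = lora_params(d, k, r=8)       # 196,608 parameters
print(f"LoRA trains {lora / full:.4%} of the full matrix's parameters")
```

With rank r = 8 the adapter trains about 0.13% of that matrix's parameters, which is where the cost saving comes from.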

@code_your_own_AI - thank you for this video. I think the analogy is super helpful, though one key difference is that when you learn on the job you retain that knowledge, while ICL does not affect the model.

joeybaruch

Can you continue training an already trained model? I.e., start from GPT-3.5 and continue training it with your own raw data? Or once a model has been trained, is it no longer trainable?

gobbledee

So that was fine-tuning before PEFT/LoRA.

LjaDjXQKeymSDxh

Can ICL be used for something like AI streamers that learn from their stream chat? Even though there's no single correct answer for how they should respond, the chat's reaction could be used as feedback.

alecalb

Might be a bit of a stupid question, but: as depicted, what is the difference between ICL and RLHF? Or is ICL just a subset of RLHF?

just..someone

Hi, do you know any good papers on ICL that describe this topic in more detail?

jonasjohnsen

Are zero-shot and few-shot learning part of ICL, depending on how many examples we give?

kislaya
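To the zero-shot/few-shot question above: yes, both are usually described as forms of ICL, distinguished only by how many worked examples the prompt contains. A minimal sketch (the task and example strings are my own, hypothetical):

```python
def build_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble an in-context-learning prompt: a task description,
    zero or more worked examples, then the actual query."""
    lines = [task]
    for inp, out in examples:  # an empty list makes this a zero-shot prompt
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

task = "Classify the sentiment of the input as positive or negative."
zero_shot = build_prompt(task, [], "I loved this movie!")
few_shot = build_prompt(
    task,
    [("The food was awful.", "negative"), ("What a great day!", "positive")],
    "I loved this movie!",
)
```

Nothing about the model changes in either case; the only difference is the prompt text sent with the query.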

How do you inject company data into an LLM's memory without using a vector database?

alok

8:15 Could you please provide examples of public LLMs which learn from ICL (in the case where future users can also benefit from your input data)? So far I've only seen ICL help with the *current* prompt, but not seen cases where it helps with *future* prompts of other users.

JuanUys

But does previous context persist across different sessions under ICL?

wryltxw

I didn't realize that you get charged for converting your training data into embeddings. Can we use open-source alternatives?

wryltxw

Is it necessary that all weights/parameters of the transformer blocks are modified during fine-tuning? I think some layers can be frozen. Correct me if I'm wrong.

shubham_chime
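The commenter is right that partial fine-tuning with frozen layers is common. A toy sketch (the per-block parameter counts are made up, not the video's numbers) of how freezing all but the last block shrinks the trainable parameter count:

```python
# Hypothetical per-block parameter counts for a small 6-block transformer.
blocks = {f"block_{i}": 7_000_000 for i in range(6)}
blocks["lm_head"] = 3_000_000

def trainable_params(blocks: dict[str, int], unfrozen: set[str]) -> int:
    """Sum parameters only over the blocks left trainable (unfrozen)."""
    return sum(n for name, n in blocks.items() if name in unfrozen)

total = sum(blocks.values())                                # 45,000,000
# Freeze everything except the last block and the output head:
subset = trainable_params(blocks, {"block_5", "lm_head"})   # 10,000,000
print(f"Training {subset / total:.0%} of the parameters")   # 22%
```

In a real framework like PyTorch, freezing is done by setting `requires_grad = False` on the frozen parameters, so the optimizer and gradient computation skip them entirely.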

Is it possible to do fine-tuning using only open-source options?

wryltxw

What about fine-tuning with your own embeddings?

hablalabiblia