Developing an LLM: Building, Training, Finetuning

DESCRIPTION:
This video provides an overview of the three stages of developing an LLM: Building, Training, and Finetuning. The focus is on explaining how LLMs work by describing what happens at each stage.

OUTLINE:

00:00 – Using LLMs
02:50 – The stages of developing an LLM
05:26 – The dataset
10:15 – Generating multi-word outputs
12:30 – Tokenization
15:35 – Pretraining datasets
21:53 – LLM architecture
27:20 – Pretraining
35:21 – Classification finetuning
39:48 – Instruction finetuning
43:06 – Preference finetuning
46:04 – Evaluating LLMs
53:59 – Pretraining & finetuning rules of thumb
COMMENTS:

Your articles and videos have been extremely helpful in understanding how LLMs are built. Build a Large Language Model (From Scratch) and Machine Learning Q and AI are resources that I am presently reading, and they provide a hands-on discourse on the conceptual understanding of LLMs. You, Andrej Karpathy, and Jay Alammar are shining examples of how learning should be enabled. Thank you!

tusharganguli

You are the best! Thanks a lot for sharing your knowledge with the world.

adityasamalla

Thank you, Sebastian, for your awesome contributions. You're a big inspiration.

chineduezeofor

One of the best 60 minutes of my time. Really thankful for this.

kyokushinfighter

You are a true educator. Honored to be a contributor to one of your libraries.

admercs

I know you don't do many tutorials, but personally I love them, especially from you!

JR-gylh

Thank you, Sir. Your lessons are beneficial for the community. Appreciate your hard work! 😊

haribhauhud

I am your fan and I have most of your books; thanks for this excellent video! Another evaluation metric that I found interesting on another channel was to make LLMs play chess against each other 10 times.

guis

Very nice video, I liked it so much that I preordered your new book directly after watching it (to be fair I have read your blog for some time now).

tomhense

You are a legend, love your work, thanks a ton for sharing!

bjugdbjk

What wonderful tech minds {Sebastian Raschka, Yann LeCun, Andrej Karpathy, ...} who share their work and beautiful ideas with mere mortals like me... Sebastian's teachings are so fundamental that they take the fear off my clogged mind... 🙏
Although I am struggling to build LLMs for specific and niche areas, I am confident of cracking them with great resources like Build a Large Language Model (From Scratch)!

ZavierBanerjea

00:02 Three common ways of using large language models
02:39 Developing an LLM involves building, pre-training, and fine-tuning.
07:11 The LLM predicts the next token in the text
09:30 Training an LLM involves sliding fixed-size inputs over the text data to create batches (see the sketch after this comment)
14:22 Byte pair encoding and SentencePiece variants allow LLMs to handle unknown words
16:42 Training sets are increasing in size
21:09 Developing an LLM involves architecture, pre-training, model evaluation, and fine-tuning.
23:14 The Transformer block is repeated multiple times in the architecture.
27:22 Pre-training creates the foundation model for fine-tuning
29:28 Training LLMs is typically done for one to two epochs
33:44 Pre-training is not usually necessary for adapting an LLM to a specific task
35:51 Replace the output layer for efficient classification (see the sketch after this comment).
39:54 Classification fine-tuning is key for practical business tasks.
42:01 LLM instruction datasets and preference tuning
45:58 Evaluating LLMs is crucial, with MMLU being a popular metric.
48:07 Multiple-choice questions are not sufficient to measure an LLM's performance
52:34 Comparing LLM models for performance evaluation
54:32 Continued pre-training is effective for instilling new knowledge in LLMs
58:28 Access slides on the website for more details

nithinma
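
To make the sliding-window idea from the summary above concrete, here is a minimal sketch (an illustration, not the video's exact code) that BPE-tokenizes a text with tiktoken and pairs each fixed-size input chunk with the same chunk shifted by one token, which is the next-token prediction target used during pretraining:

```python
# Minimal sketch (illustrative, not the video's code): build next-token training
# pairs by BPE-tokenizing the text and sliding a fixed-size window over the IDs.
import tiktoken
import torch
from torch.utils.data import Dataset, DataLoader

class NextTokenDataset(Dataset):
    def __init__(self, text, context_length=8, stride=4):
        tokenizer = tiktoken.get_encoding("gpt2")   # GPT-2 byte pair encoding
        ids = tokenizer.encode(text)
        self.inputs, self.targets = [], []
        # The target chunk is the input chunk shifted one position to the right,
        # so the model learns to predict the next token at every position.
        for i in range(0, len(ids) - context_length, stride):
            self.inputs.append(torch.tensor(ids[i:i + context_length]))
            self.targets.append(torch.tensor(ids[i + 1:i + context_length + 1]))

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

text = "LLMs are pretrained to predict the next token in the text. " * 8
loader = DataLoader(NextTokenDataset(text), batch_size=2, shuffle=True)
x, y = next(iter(loader))
print(x.shape, y.shape)  # e.g. torch.Size([2, 8]) torch.Size([2, 8])
```

With batches like these, pretraining minimizes the cross-entropy between the model's next-token predictions and the shifted targets.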
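
The classification-finetuning step at 35:51 boils down to keeping the pretrained backbone and swapping the vocabulary-sized output head for a small head with one output per class. A minimal sketch with an illustrative stand-in backbone (the names and sizes here are assumptions, not the GPT implementation from the video):

```python
# Minimal sketch (illustrative stand-in, not the GPT implementation from the video):
# classification finetuning keeps the pretrained backbone and only swaps the
# vocabulary-sized output head for a small head with one output per class.
import torch
import torch.nn as nn

emb_dim, vocab_size, num_classes = 768, 50257, 2   # e.g. spam vs. not-spam

class TinyBackboneLM(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in backbone; a real GPT-style model would use causal (masked)
        # self-attention blocks repeated many times.
        layer = nn.TransformerEncoderLayer(d_model=emb_dim, nhead=12, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.out_head = nn.Linear(emb_dim, vocab_size)  # next-token head from pretraining

    def forward(self, x):                 # x: (batch, seq_len, emb_dim)
        return self.out_head(self.backbone(x))

model = TinyBackboneLM()                  # imagine this is loaded with pretrained weights
model.out_head = nn.Linear(emb_dim, num_classes)   # replace the output layer

x = torch.randn(4, 16, emb_dim)           # 4 sequences of 16 token embeddings
logits = model(x)[:, -1, :]               # use the last token's output as the class logits
print(logits.shape)                       # torch.Size([4, 2])
```

Because only the new head (and optionally the last few transformer blocks) has to be updated, this kind of finetuning is far cheaper than pretraining.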

Thanks for the detailed videos and articles. I want to ask whether it's possible to create a customized tokenizer as an extension of existing ones for a custom dataset. Also, how do decoder-only models handle other tasks like summarization and classification after fine-tuning without forgetting their pretrained causal next-token prediction task?

moshoodolawale

Thanks for the great knowledge you are sharing <3

rachadlakis

Oh, my lord, my favourite machine learning author is a Liverpool fan.😎

haqiufreedeal

Hi, nice videos! One question for my understanding: when talking about embedding dimensions such as 1280 in "gpt2-large", do you mean the size of the number vector encoding the context of a single token, or the number of input tokens? When comparing gpt2-large and Llama 2, the number is the same for the ".. embeddings with 1280 tokens".

RobinSunCruiser

@16:37 when you say Llama was trained on 1T tokens, do you still mean there were 32K unique tokens? Because in your blog post you have "They also have a surprisingly large 151,642 token vocabulary (for reference, Llama 2 uses a 32k vocabulary, and Llama 3.1 uses a 128k token vocabulary); as a rule of thumb, increasing the vocab size by 2x reduces the number of input tokens by 2x so the LLM can fit more tokens into the same input. Also it especially helps with multilingual data and coding to cover words outside the standard English vocabulary."

Xnaarkhoo
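
As a small illustration of the vocabulary-size rule of thumb quoted in the comment above (using tiktoken's "gpt2" and "cl100k_base" encodings as stand-ins for a roughly 50k and a roughly 100k vocabulary; this is not code from the video), the larger vocabulary generally needs fewer tokens for the same text, especially for non-English input:

```python
# Small illustration (not from the video): a tokenizer with a larger vocabulary
# usually needs fewer tokens to encode the same text, which matters most for
# multilingual text and code.
import tiktoken

text = "Die Tokenisierung mehrsprachiger Texte profitiert besonders von großen Vokabularen."

for name in ("gpt2", "cl100k_base"):     # ~50k vs. ~100k vocabulary
    enc = tiktoken.get_encoding(name)
    print(f"{name}: vocab size {enc.n_vocab}, tokens for the sentence: {len(enc.encode(text))}")
```

The exact counts depend on the tokenizer, but the direction of the effect matches the rule of thumb in the quoted passage.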

When is your whole book coming out? Eagerly waiting 😅

sahilsharma

Great video. Now that LLMs are so powerful, will regular machine learning and deep learning slowly vanish?

KumR

I strongly assume you speak German :). Where can one find your book in Kindle (mobi or fb2) format? Thanks & best regards.

andreyc.