Breaking Down Meta's Billion Dollar LLM Blueprint [Llama-3.1 Full Breakdown]


check out my newsletter:

Llama-3.1's 92-page paper is an engineering report that most people wouldn't care much about, but for AI developers it's the goldmine paper of LLMs. Why is that? Let's find out what Meta's researchers shared about how an absolute chungus of a model is trained and optimized.

Llama-3.1 405B

This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Robert Zawiasa, Owen Ingraham, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Penumbraa, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner, Akkusativ, Oleg Wock, FantomBloth, Thipok Tham, Clayton Ford

[Music 1] massobeats - gingersweet
[Music 2] massobeats - lush

[Video Editor] Silas

0:00 Intro
2:52 model architecture
4:50 scaling law
6:39 compute & hardware optimization
10:24 Poe
11:48 Training recipe
17:49 Data mix
Comments

and probs no more 20-min vids from me, it's literally death itself to record them

bycloudAI

this video really makes me wanna read the whole paper, rare to see a company publish such a detailed paper

erenplayzmc

A "multimodal" chatbot:
5 different models hot glued together

Napert

Karpathy in 5 years: Reproducing LLaMa 3.1 405B

YourAverageHuman-

54 days training and it reached GPT-4o 🤯
GPT-5 with X-trillion parameters is going to start its own weight class of LLMs 😌

RedOneM

So glad this answered more questions than I ever thought existed.

FunIsGoingOn

Wow, that's one epic tutorial.
Llama 3 Training Ritual
Difficulty: Deadhead
Rarity: Mythic
Minimum Level to Read Description: 80
Minimum Level to Embark: XXX (requires further enlightenment)

apoage

It's actually pretty cool that Poe sponsors you. They're genuinely what I recommend to anyone who wants to use LLMs.

pareak

i'm mad excited for llama 4 because multimodal

GraveUypo

06:08 The isoflops curve explanation was a mind-bender! Thanks for breaking it down.

The.AiSide

First time an advertisement has actually made me return to a video and watch it again to find it.

Regardless of that, this was super helpful, thank you so much.😅

RicardoPoleo

new video dropped... * breathing heavy *

diga

It was an excellent video, but still I don't think the kids from 3:00 are gonna make it.

Hodoss

Great video, I'd love to see more like this, even more technical ones, and also about multimodal model architectures.

elwii

It's clear to me that Llama 4 will have MoA like GPT-4o. It would be nice to see an image generator integrated too, but let's not get ahead of ourselves. Let's hope that it will also be "open source" (although the current models aren't technically open source, because you're not completely free to do whatever you want with this technology. Look it up)

dimii

This is an excellent breakdown of the paper. Thank you

sammcj

So I guess I'm gonna be stuck on that desert island then 😅

Ikbeneengeit

Damn, I need to invest in META. They will dominate standardization.

JohnDontFollowMe

wow this is amazing, thanks, very well received here.

matt-se

“how to build a nuke in less than 100 pages” - Meta

redthunder