The Insane Hardware Behind ChatGPT

Find out what makes ChatGPT work.

Leave a reply with your requests for future episodes.

Comments

Boys, put this in your calendar. We've just witnessed the first GPU that can't run Doom.

itsapersonn

You missed the most important spec of the A100/H100: its memory. 80GB at 2TB/s, twice as fast as the 4090. VRAM capacity and bandwidth are what really matter in these AI/HPC workloads. More capacity means a larger model fits, and performance is proportional to bandwidth rather than TFLOP/s.
PS: Comparing sparse-matrix FP16 TFLOP/s of one card to general FP32 TFLOP/s of another card is bogus too. Quality on LTT channels has really suffered recently.
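The bandwidth point can be sanity-checked with a back-of-envelope roofline calculation. The A100 figures below are its publicly listed specs, and modeling batch-1 LLM decoding as pure matrix-vector work is a simplification:

```python
# Back-of-envelope roofline for an A100 80GB (publicly listed specs:
# ~312 TFLOP/s dense FP16 tensor, ~2 TB/s HBM2e bandwidth).
peak_flops = 312e12        # FP16 tensor-core FLOP/s (dense)
peak_bw    = 2.0e12        # HBM bandwidth in bytes/s

# Machine balance: FLOPs you must do per byte loaded to be compute-bound.
balance = peak_flops / peak_bw   # FLOP/byte

# Batch-1 LLM token generation is essentially matrix-vector products:
# each FP16 weight (2 bytes) is read once and used for ~2 FLOPs
# (multiply + add), so arithmetic intensity is about 1 FLOP/byte.
intensity = 2 / 2

# Roofline: achievable throughput is capped by whichever limit binds.
achievable = min(peak_flops, intensity * peak_bw)
print(f"balance point: {balance:.0f} FLOP/byte")
print(f"batch-1 decode: ~{achievable/1e12:.0f} TFLOP/s "
      f"({100*achievable/peak_flops:.1f}% of peak)")
```

At ~1 FLOP/byte against a ~156 FLOP/byte balance point, the tensor cores sit almost entirely idle waiting on memory, which is why bandwidth, not TFLOP/s, predicts performance here.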

ProjectPhysX

I work for the company that builds and maintains these servers for Microsoft and it is absurd how crazy the H100s are compared to the A100s. Just the power projects alone cost millions of dollars per site for the upgrade.

DewittOralee

Guys, if you get a key detail wrong 4-6 times in a video, an asterisk correction isn't enough, do a retape.

CRossEsk

LTT should start keeping stats on how many text-overlay corrections they make per month for things said incorrectly in videos.

spidersj

Correction on the A100: the $10,000 version is the 40GB model. The 80GB model tends to go for double, and that's the one AI people actually like using.

Henk

Interestingly, the limiting factor for LLMs (and most ML models running on these systems) is actually now memory bandwidth. Utilizing >33% of the raw FLOPS is considered good and more than 50% is great. (And that's even with the insane caches and memory bandwidth.)
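A minimal sketch of how that utilization figure, often called model FLOPs utilization (MFU), is computed. The parameter count and throughput below are made-up illustrative numbers; the ~6 FLOPs per parameter per training token is the standard approximation:

```python
# Hypothetical numbers to illustrate model FLOPs utilization (MFU):
# the fraction of a GPU's peak FLOP/s a training run actually achieves.
params         = 70e9      # model size (assumed for illustration)
tokens_per_sec = 250       # measured per-GPU throughput (assumed)
peak_flops     = 312e12    # A100 dense FP16 tensor peak

# Training cost is commonly approximated as ~6 FLOPs/parameter/token.
achieved_flops = 6 * params * tokens_per_sec
mfu = achieved_flops / peak_flops
print(f"MFU = {mfu:.1%}")
```

With these assumed numbers the run lands in the "considered good" range the comment describes, despite leaving most of the card's peak compute on the table.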

JeremyOrlow

Loved the video. When the script is wrong, why don't you reshoot the sections with mistakes? Just curious.

EndoBaggins

James: "4 times"
Editor: "It's actually 8"
James: "6 times"
Editor: "It's actually 3"

Me: "... So James's math eventually checks out?"

tiaxanderson

How did I manage to catch an LTT video as soon as it was posted?

StolenJoker

I like how you say 'flat' at 1:05

CharlesTheClumsy

The true revolution will come with *decentralized, local* LLMs.
They will be yours.
They will have long-term memory.
The time you spend with them will be a significant part of their later training (pre-trained to do the basic stuff, but personality develops through interaction with you).

Like Jarvis from Iron Man.

foxtrotunit

Technically, yes, you need more processing power to train the model than to run it if you're comparing one-to-one (training vs. trained); after a model is trained it's far cheaper to run. But as demand increases, total inference compute will eventually surpass training compute, as explained in the video.
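A rough FLOP accounting of that trade-off, using the common approximations of ~6 FLOPs per parameter per training token and ~2 FLOPs per parameter per generated token, with GPT-3-scale numbers:

```python
# One-time training cost vs. recurring inference cost, using standard
# approximations: ~6 FLOPs/param/training token, ~2 FLOPs/param/token served.
params       = 175e9     # GPT-3-scale parameter count
train_tokens = 300e9     # GPT-3-scale training corpus

train_flops     = 6 * params * train_tokens   # one-time cost
flops_per_token = 2 * params                  # recurring cost per served token

# Tokens served before cumulative inference compute passes the training run.
# Note params cancels out: break-even is simply 3x the training tokens.
breakeven_tokens = train_flops / flops_per_token
print(f"training run: {train_flops:.2e} FLOPs")
print(f"inference overtakes training after {breakeven_tokens:.1e} tokens served")
```

With hundreds of millions of users, serving 3x the training corpus goes quickly, which is the comment's point about demand.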


When you get the majority of the facts wrong and need on-screen correction notes added in editing several times, there comes a point when you should probably just reshoot the video.

NeilD

This is why the Microsoft Co-pilot subscription will cost $30 a month.

Zedilt

I’d love an episode on how game devs make game saves work, i.e. what does the data look like that tells the game how far you are and what you have?
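One common answer to this question is a serialized snapshot of game state: progress markers, inventory, and world flags written to disk and read back on load. The sketch below uses JSON with entirely made-up field names, purely for illustration:

```python
# A toy save file: a snapshot of "how far you are and what you have",
# serialized to JSON. All field names here are invented for illustration.
import json

save = {
    "version": 3,                       # lets old saves be migrated later
    "checkpoint": "level_4/boss_gate",  # how far you are
    "playtime_seconds": 7265,
    "player": {"hp": 82, "position": [312.5, 14.0, -97.25]},
    "inventory": [                      # what you have
        {"item": "health_potion", "count": 3},
        {"item": "iron_key", "count": 1},
    ],
    "flags": {"met_blacksmith": True, "bridge_destroyed": False},
}

blob = json.dumps(save)      # this string is what gets written to disk
loaded = json.loads(blob)    # loading a save reverses the process
assert loaded["checkpoint"] == "level_4/boss_gate"
```

Real engines often use compact binary formats plus checksums instead of JSON, but the idea is the same: the save is just the minimal data needed to reconstruct the player's state.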

spencercharles

James must be using ChatGPT to do his math.

BlackHoleForge

Can you do one about the insanely underpaid manpower behind its content filtering?

rplf

Maybe GPU is not a name that fits its entire function anymore.

TheGroselha

I think a good topic for Techquickie to discuss would be jobs in the industry, as most people just think of ICT or engineering.

Broski