AI Hardware, Explained.

In 2011, Marc Andreessen said, “software is eating the world.” And in the last year, we’ve seen a new wave of generative AI, with several apps becoming some of the most swiftly adopted software products of all time.

In this first part of our three-part series, we explore the terminology and technology that now form the backbone of the AI models taking the world by storm: what GPUs are, how they work, and the key players like Nvidia competing for chip dominance.

Look out for the rest of our series, where we dive even deeper, covering supply and demand mechanics, the role of open source, and of course… how much all of this truly costs!

Topics Covered:
00:00 – AI terminology and technology
03:54 – Chips, semiconductors, servers, and compute
05:07 – CPUs and GPUs
06:16 – Future architecture and performance
07:12 – The hardware ecosystem
09:20 – Software optimizations
11:45 – What do we expect for the future?
14:25 – Upcoming episodes on market dynamics and cost

Resources:


Comments

For a sneak peek at parts 2 and 3: they're already live on our podcast feed! Animated explainers coming soon.

az

Timestamps:

00:00 – AI terminology and technology
03:54 – Chips, semiconductors, servers, and compute
05:07 – CPUs vs GPUs
06:16 – Future architecture and performance
07:12 – The hardware ecosystem
09:20 – Software optimizations
11:45 – What do we expect for the future?
14:25 – Sneak peek into the series

az

"To remain competitive, large companies must integrate AI into their supply chain management, optimizing logistics, reducing costs, and minimizing waste."

AnthatiKhasim-ie

This is highly informative and easy to understand. As an idiot, I really appreciate that a lot.

AlexHirschMusic

Since floating-point weights are usually stored at 32 bits, is this why quantization for LLMs can go as low as around 4 bits (e.g., with ExLlama), making it so much easier to fit models into the limited VRAM of consumer GPUs?
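
For intuition on the VRAM point, here is a rough back-of-envelope sketch (in Python) of how weight precision translates into memory. The 7-billion-parameter count and the precision list are illustrative assumptions, not figures from the video.

# Back-of-envelope estimate of weight memory at different precisions.
# The 7B parameter count is a hypothetical example; real usage also needs
# room for activations, the KV cache, and framework overhead.

def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for the model weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

params = 7e9  # hypothetical 7-billion-parameter model
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(params, bits):.1f} GB")

By this estimate, 32-bit weights for a 7B model take roughly 28 GB, while 4-bit weights take about 3.5 GB, which is why aggressive quantization brings models within reach of consumer GPUs that typically ship with 8 to 24 GB of VRAM.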

Incredible video. The interviewer asks really thought-provoking and relevant questions, and the interviewee is extremely knowledgeable as well. It's broken down so well too!

Also, extremely grateful to a16z for supporting TheBloke's work in LLM quantization! High-quality quantization and simplified instructions make LLMs so much easier to use for the average Joe.

Thanks for creating this video.

Inclinant

Well done, very clean and clear. Love your simplicity

NarsingRaoschoolknot

An excellent primer for beginners in the field.

lnebres

Great video. The tip of the iceberg of computational innovation.

TINTUHD

Guido Appenzeller is speaking my language. Chip lithography keeps shrinking while chips still consume lots of power. Parallel computing is definitely going to be widely adopted going forward. RISC-V might replace the x86 architecture.

MatrixGamer

The music is very distracting. Please tone it down in the future.

jack_fischer

Excellent video. Thank you and well done

nickvanrensburg

Love this channel! Could we also look at the hunger for energy and its impact on climate change?

lerwenliu

No wonder Nvidia doesn't care about consumer GPUs anymore.

IAMNOTRANA

It would be so cool if the main speaker were a clone.

shwiftymemelord

Do AI and cloud computing face the same power supply issues as cryptocurrencies?

"Cryptocurrency mining, mostly for Bitcoin, draws up to 2,600 megawatts from the regional power grid—about the same as the city of Austin."

billpabq

Thanks for the video, but 4 minutes before getting to any details in a 15-minute video?

MegaVin

1:24 Ehm… I would like to know what camera and lens/focal length you use to match the boom arm and background bokeh so perfectly 🤐

dinoscheidt

A slightly different way of looking at Moore's Law is that it isn't "dead" but rather becoming irrelevant. Quantum computing operates very differently from binary digital computation, so it makes little sense to compare these two separate domains in terms of how many transistors fit into a 2D region of space, or FLOPS performance. Aside from the extreme parallelism available in QC, the next stage from here is optical computing, using photons instead of electrons as the computational mechanism. Scalable analog computing ICs (for AI engines) are also being developed (by IBM, for example). Moore's Law isn't relevant to any of these.

SynthoidSounds

Has AI's power consumption doomed it to failure before it has even started?

billpabq