Deep-dive into the technology of AMD's MI300


AMD's MI300 combines five next-gen technologies into one 147 billion transistor chip monster. In this video we will take a closer look at the tech AMD uses to combine CDNA3 GPU cores with Zen 4 CPU cores to create the world's first HPC AI APU, and talk about AMD's strategy for reaching Zettascale in the future.

0:00 Intro
0:42 Five next-gen technologies of MI300
3:19 MI300 Specs
4:36 Chiplet Design
8:06 Brilliant
9:15 Advanced Packaging / 3D Stacking
11:48 SoC / APU Design
13:15 Unified Memory
14:16 AI / ML Acceleration
15:15 AMD's Zettascale Strategy
Comments

Do you think all five technologies will "trickle down" into the consumer market? Will we buy a single "gaming SoC" at some point in the future?

HighYield

One correction I would mention: the latest version of PyTorch, PyTorch 2.0, is moving away from CUDA. For ever-larger AI models, CUDA on its own can't optimize for the scale; the frameworks themselves have to be optimization-aware. This is why ML frameworks are shifting from eager mode to graph mode, which sidesteps CUDA (cuDNN) and provides better performance. Instead of using CUDA, they will use tools like Triton (this is what OpenAI's ChatGPT uses), which interfaces directly with Nvidia's NVPTX and AMD's LLVM AMDGPU backends. So CUDA is on its way out, and with it Nvidia's software moat. MI300 will be a monster.

SirMo
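
A minimal sketch of the eager versus graph distinction described in the comment above. This is a pure-Python toy, not PyTorch, Triton, or CUDA code: the names `run_eager`, `compile_graph`, and `run_graph` are made up for illustration, and a "pass" only stands in for a separate kernel launch with its own intermediate buffer.

```python
from typing import Callable, List, Tuple

Op = Callable[[float], float]


def run_eager(ops: List[Op], data: List[float]) -> Tuple[List[float], int]:
    """Run each op as its own full pass over the data (one 'kernel' per op)."""
    passes = 0
    for op in ops:
        data = [op(x) for x in data]  # a whole intermediate list per op
        passes += 1
    return data, passes


def compile_graph(ops: List[Op]) -> Op:
    """Capture the op chain up front and fuse it into a single function."""
    def fused(x: float) -> float:
        for op in ops:
            x = op(x)
        return x
    return fused


def run_graph(ops: List[Op], data: List[float]) -> Tuple[List[float], int]:
    """Run the fused function once per element: a single pass over the data."""
    fused = compile_graph(ops)
    return [fused(x) for x in data], 1


# Hypothetical op chain: scale, shift, then a ReLU-like clamp.
ops: List[Op] = [lambda x: x * 2, lambda x: x + 1, lambda x: max(x, 0.0)]
data = [-1.0, 0.5, 2.0]

eager_out, eager_passes = run_eager(ops, data)
graph_out, graph_passes = run_graph(ops, data)
print(eager_out, eager_passes)
print(graph_out, graph_passes)
```

Both paths compute the same values; the only difference in this toy is how many times the data is walked, which is the kind of saving a graph compiler can look for once it sees the whole chain of operations.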

This may be the first step of the war coming between AMD and Nvidia. I'm waiting for Intel to react, but the advances that AMD is making are huge.

lefthornet

Something I wanted to add: since we don't know the packaging method yet, when I talk about the "interposer" it doesn't have to be a large silicon interposer. It might be a small "organic interposer" like on Navi31, using TSMC's "Fan-Out" technology. Once we know more, another video will follow!

HighYield

Incidentally when AMD was buying ATI for Radeon the "Fusion" idea of not just seamless GPU fp compute but a unified address space was used as justification.
It's over a decade but at last this becomes more feasible.
Not just MI300 but SAM and recent DX12 extensions are aimed at shared address space.

RobBCactive

your videos are amazing and i learn so much new interesting information. even though i don't understand everything, it's still rewarding to watch you explain and to develop my own knowledge just as you did. thanks for that and greetings from hamburg, germany :)

hanspeter

I think you are spot on.
Though I would say that another tech to look out for is in-chip fluid cooling. Heat is a huge problem, especially with 3D stacking. Efficient extraction of heat allows for higher frequencies and lower energy use (as heat increases resistance).

bartomiejpopielarz
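
The remark above that heat increases resistance can be made concrete for metal wiring with the usual linear approximation. The copper coefficient is a textbook room-temperature value, and the 50 K rise is only an assumed example, not a measured figure for any stacked chip.

```latex
R(T) \approx R_0 \left[ 1 + \alpha \, (T - T_0) \right],
\qquad \alpha_{\mathrm{Cu}} \approx 0.0039\ \mathrm{K^{-1}}

\frac{R(T_0 + 50\,\mathrm{K})}{R_0} \approx 1 + 0.0039 \times 50 = 1.195
```

Under those assumptions a copper interconnect running 50 K hotter has roughly 20% more resistance, which is the effect better heat extraction would reduce.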

Awesome job covering the MI300. It's so impressive and so far beyond anything ever made that no one knows how to cover it or even talk about it. The technology is going to make it into gaming consoles. I predict next gen is going to be so integrated that the thought of adding RAM and separate cards to a PC will feel ancient.

jelipebands

1. Excellent video. I don't follow the data center innovations that closely, I'm more of a desktop gaming guy, so this video was absolutely fascinating to me. Well explained, well segmented. And it's exciting to think about what this will mean for the desktop for the upcoming decade.
2. Before you introduced the 5 new technologies, I paused the video and gave it a quick think myself. I basically came up with the same categories. Except I came up with "heterogeneous design". In my head that was something that takes the SoC and disaggregates it into chiplets but also includes mixing process nodes and possibly chiplets made by different vendors / foundries. We're not quite there yet. But in my head mixing 5 with 6 nanometers is a part of it. So I basically mushed your "SoC" and "Packaging" categories together and added a bit of my own flavor.
3. The classic 'von Neumann' architecture on PC can't keep up anymore. We see this with the consoles, how a smaller, much cheaper design can yield incredible performance. Mid to high-end PCs that cost 3 times more struggle to play the latest console game ports. This is ridiculous, something's got to change. I'm curious what a next gen PC architecture will look like. Will we still have a modular design, what will cooling look like, and will manufacturers be able to agree on a standard in time before consoles make the PC look even more boomer than it already looks to some people?
4. Exciting times ahead.

earllemongrab

AMD is way ahead of the curve vs the competition. They just need someone to market the tech better. They are a true heterogeneous system and get better and better every year. Now AMD is sharing GDDR with CPU / GPU and other AI accelerators.

WXSTANG

Fugaku with the Fujitsu A64FX walked so El Capitan and MI300 could run. Seriously.

BrandonMeyer

MI300 is something I've been waiting for since I saw the initial HPC chiplet APU patent... The interesting thing is that older CPUs used to remove functions from the CPU die because of limited transistors at 180nm etc...

But the biggest thing about it is that I heard some autonomous driving folks say that 2000 TOPS are needed for an FSD experience, and since MI250X has 383 TOPS, 8X that is just over 3000 TOPS... AMD can now theoretically provide all the chips for automobiles out of nowhere it seems (NOT!!!)... They can use an edge-like appliance with a Pensando front end for network relays for traffic and weather, etc. for a LARGE MAP area, while an upcoming Ryzen APU can do the entire system, including 4K video and gaming... Companies are selling mini PCs with Ryzen and Radeon 6000 that can do 4K30+... Zen4 telco servers can do edge processing while EPYC can stream games and all types of data including AI for predictive routing...

ChristianHowell
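
Taking the figures in the comment above at face value, the arithmetic can be checked in a few lines. The constants are the commenter's numbers (383 TOPS for MI250X, an 8X uplift, a 2000 TOPS target), not verified specifications, and the variable names are made up for illustration.

```python
MI250X_TOPS = 383        # figure quoted in the comment
CLAIMED_UPLIFT = 8       # the "8X" from the comment
FSD_TARGET_TOPS = 2000   # requirement the comment attributes to autonomous driving folks

projected_tops = MI250X_TOPS * CLAIMED_UPLIFT
headroom = projected_tops / FSD_TARGET_TOPS

print(projected_tops)       # 3064
print(round(headroom, 2))   # 1.53
```

So under those assumptions the projection lands at about 3,064 TOPS, roughly 1.5 times the quoted target.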

The packaging/chiplet design is quite brilliant (speaking of, gratz on the sponsorship)! One day, hopefully we'll see all of these techniques trickle down to desktop/consumers! The Zettascale strategy is interesting because it pulls you into real-world limitations, that is physics, that will inevitably halt performance if we don't invest in new techniques. Take 3D V-Cache: although it is a great solution for more L3$, there are still thermal limits. AMD investing in R&D is a long-term goal. Brute-forcing today's technologies, like monolithic designs, will prove unreasonable in the near future.

NootNoot.

1) LightMatter wafer scale optical interconnect
2) Ultraram replacing most chiplet cache, HBM, DRAM, and NVM
3) Accelerators on chip/package
4) Combining CPU, GPU, FPGA on package
5) Backside power delivery
6) VTFET
7) Deep trench capacitor on wafer with direct 3D bonding integration
8) Glass based motherboards with integrated photonics, power delivery, and microfluidic cooling

peterdoyle

To me, the most memorable part of the keynote in the entire Zettascale race was logic on memory.
I can't really imagine just how much you could realistically put on RAM, probably only basic math operations as anything too complex would probably be too costly.

But if you can even just do basic math, even just add/subtract/jump, it'd be a true revolution. So many basic operations would be offloaded from the CPU and live in the RAM. The CPU would just have to send the request, and that would seriously cut down on transfers. You could go from 20 transfers and operations down to something like one CPU -> RAM transfer, operations by the RAM, and then one RAM -> CPU transfer when done to send back the updated data the CPU wanted.

It's truly revolutionary in speed and efficiency. How costly/plausible...don't know. But I find it to be the most impressive thing.

OneAngrehCat
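
The transfer-count argument in the comment above can be sketched as a toy model. Nothing here reflects a real memory controller or AMD's actual logic-on-memory design: `Bus`, `cpu_side_sum`, and `in_memory_sum` are hypothetical names, and the 20-element array just mirrors the "20 transfers" in the comment.

```python
class Bus:
    """Counts every value that crosses between CPU and RAM."""

    def __init__(self) -> None:
        self.transfers = 0

    def move(self, value):
        self.transfers += 1
        return value


def cpu_side_sum(ram, bus):
    """Conventional path: every operand crosses the bus to reach the CPU."""
    total = 0
    for value in ram:
        total += bus.move(value)  # one RAM -> CPU transfer per element
    return total


def in_memory_sum(ram, bus):
    """Assumed logic-on-memory path: only a request and a result cross the bus."""
    bus.move("sum request")       # one CPU -> RAM transfer
    result = sum(ram)             # done by the (hypothetical) logic next to the RAM
    return bus.move(result)       # one RAM -> CPU transfer


ram = list(range(1, 21))  # 20 values held in memory

conventional_bus = Bus()
conventional_total = cpu_side_sum(ram, conventional_bus)

pim_bus = Bus()
pim_total = in_memory_sum(ram, pim_bus)

print(conventional_total, conventional_bus.transfers)
print(pim_total, pim_bus.transfers)
```

Both paths return the same sum; the toy only shows the bus traffic dropping from one transfer per operand to a single request plus a single reply.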

And with chiplet design, AMD can scale their products way more easily than competitors. Shown last week, MI300 has 2 variants: "A" with 6 GCDs and 3 CCDs, and "X" with basically all CCDs replaced entirely with GCDs, making it GPU only. This modularity is going to please all kinds of customers.

lasbrujazz

Excellent vid, thank u for making it!

D.u.d.e.r

I hope all manufacturers start stacking chips and putting them in our desktops. Gonna get some cool stuff soon.

closerlookcrime

I felt this was the best ad transition ever. It kinda convinced me to learn some on Brilliant 😂

u-def

Interconnects are the key to the new age of 3D stacked chips, I think. We will get to a point where the processor is not 2D but more like a solid cube. Inside this solid cube is all semiconductor.

wayofflow