What are Apple's GPU cores?

Ever wonder why Apple lists GPU core counts like 64 and 76, but an Nvidia RTX 4080 has 9,728 CUDA cores?

At the highest level is the graphics processing unit (GPU), a parallel processor optimized for executing many instructions at once, as opposed to traditional CPUs, which are optimized for sequential processing. The data is funneled to streaming multiprocessors, which in Apple's vernacular are referred to as cores.

In Apple Silicon, at least for the M1 series, each core is split into 16 execution units, each with 8 arithmetic logic units (ALUs). For example, the top-end M1 Ultra has 64 cores with 1,024 execution units, or 8,192 ALUs.
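
The arithmetic above can be checked with a quick sketch. The per-core figures (16 execution units per core, 8 ALUs per execution unit) are the ones quoted in this description for the M1 series; the function and constant names are only illustrative.

```python
# Figures quoted in the description for the M1 series.
EXECUTION_UNITS_PER_CORE = 16
ALUS_PER_EXECUTION_UNIT = 8


def gpu_totals(cores: int) -> tuple[int, int]:
    """Return (execution_units, alus) for a given Apple GPU core count."""
    execution_units = cores * EXECUTION_UNITS_PER_CORE
    alus = execution_units * ALUS_PER_EXECUTION_UNIT
    return execution_units, alus


# M1 Ultra with 64 GPU cores
print(gpu_totals(64))  # (1024, 8192)
```

Each core therefore works out to 16 × 8 = 128 ALUs, which is the number to hold against a "CUDA core" count rather than the headline core count.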

My Patreon:

---
Minor Correction:
In the video, I mentioned that Intel Macs had a shared memory design. That is true, but I should have clarified that on Intel Macs the GPU has a reserved pool of RAM, which is not the same as Apple Silicon. Apple Silicon is a much superior design, as the processing stages (vertex shaders, pixel shaders, texture units) are part of the same pipeline.

Sources:

Comments

Minor correction:
In the video, I mentioned that Intel Macs had a shared memory design. That is true, but I should have clarified that on Intel Macs the GPU has a reserved pool of RAM, which is not the same as Apple Silicon. Apple Silicon is a much superior design, as the processing stages (vertex shaders, pixel shaders, texture units) are part of the same pipeline.

dmug

Finally, a reasonable explanation of what “cores” are on the very different platforms. Thank you. It seems that right up to the Studio Ultra (small desktop), the tremendous power savings and quiet operation are the main selling features of the M architecture, and graphics performance is good enough. If the M3 doesn’t change the design to include thousands of additional external PCI lanes plus that hybrid graphics architecture you mentioned Apple had patented, I would say they’re completely done competing in that space.

bobsykes

Really informative and helpful. Thanks for making a really clear breakdown between Apple's GPU cores vs. NVIDIA's

KrishnaDraws

Apple might have some really good technology and whatnot, but it all comes down to third-party apps adopting such tech. The problem is, Apple has always been a moving target. Not to mention all the secrecy, which makes it really tough for third parties to adopt. And the drama with some other companies…

yjchoi

Apple: we want gamers on our platform.
Also Apple: we are not interested.
AMD: we have an Apple silicon killer.
Apple: nah ur lying.
💫Strix Halo entered the market.
Apple:

Sheerwinter

Appreciate your honesty. Our studio waited so long for Apple to catch up to PC. The M's were incredibly disappointing. We finally bit the bullet, and switched to PC for the first time in 14 years and we're so happy

slytherben

This whole video was literally what I've been asking my brain about all week. I thought I'd never get an answer. Thanks for reaching into my mind and making a video about it

mfjae

Good editorial sir.

I enjoyed listening to your take. What I like about Apple SOC is the lack of heat during basic operations. 😮

WarriorsPhoto

I think they are trying to force the industry into developing for their integrated GPU before they start thinking about external GPUs. To be honest, I don't think Apple ever dreamed of the success of Apple silicon.

nyambe

One thing I will never do is be on a "team" regarding a computer. Great video!

curtiswindover

thank you for explaining this apple-marketing-naming

tonyk

This video is good, but there are two unmentioned nerdy details which have a big impact on performance.

Cache hierarchy and dispatch style.

Within a GPU process a lot of data is used multiple times. Unified memory is fast, but it’s not fast enough to keep the GPU fed. With this in mind the M series chips include a huge amount of cache right within the GPU cores so frequently used data is fast to access. NVIDIA uses cache layering too, but the caches used are smaller and slower. It’s part of why they consume more power than the M series chips as data is being shuffled around more often.

The second interesting difference is dispatch style.

The NVIDIA approach is to split the task across huge numbers of fairly slow cores which operate in parallel to produce a frame. Each core produces a single tile.

The Apple GPUs are ‘tile-based deferred renderers’. There are fewer cores, but each core is much faster and can render multiple tiles per frame.

The difference in dispatch approach is why ‘core’ numbers seem so different but the performance isn’t as far off as the numbers would suggest.

Apple take the ‘few but fast’ approach rather than the ‘many but slow’ approach used by others. Software needs to be written with this in mind. It’s not hard. But it does require a bit of engineering time to get the best performance.

I hope that’s a useful bit of context for why an apples to, erm, apples comparison is so tricky.

The power efficiency side of things is also deeply impressive, though less relevant for desktop gaming.

If Apple ever made a competitively priced server component with the performance per watt of the M series part they would give NVIDIA a big headache. Luckily… Apple probably don’t have the fab capacity to compete in that market.

jamieknight

the level of detail in this video is outstanding!

CubieJetCube

Hey mate, would you mind doing a follow-up video on this, looking at the supposedly 'improved ray tracing' on the M4s? Thanks. You make great content; I'm a new happy subscriber!

art-thou-gomeo

Hey! I'm searching for a laptop (preferably) or a PC that can perform intensive math- and physics-based simulations, and I was wondering if I should go for an MBP, an Nvidia (Intel) based laptop, or a custom build. From my experience, Unix-based machines do the work. I want to know your views from a hardware perspective.

unexploitedsoul

In perf/W, Apple is quite a bit ahead of NV.

It is worth noting that the OpenCL score on macOS is very poor, so it only matters if your task supports nothing but OpenCL; the OpenCL driver has a LOT of issues (it's deprecated). The math done in the Metal test and the OpenCL test in GB6 is the same, and the score is computed in the same way, so you can compare between backends; in fact, that is the point. The aim is to let you see the massive perf benefit you get from using Metal (or, in other words, how poor the OpenCL driver is).

hishnash

Basically:
One Apple GPU core = one NVIDIA SM (streaming multiprocessor) = 2 AMD CUs = 128 FP32 lanes (or shader cores / ALUs)

ThibautMahringer
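
As a rough sketch of the equivalence in the comment above, the snippet below converts each vendor's unit count into FP32 ALUs. The per-unit figures are taken from that comment (128 per Apple core or NVIDIA SM, so 64 per AMD CU); they are assumptions that vary by GPU generation, and the names used here are hypothetical.

```python
# FP32 ALUs per vendor unit, as implied by the comment's equivalence.
# These are assumptions for illustration and differ between GPU generations.
FP32_ALUS_PER_UNIT = {
    "apple_core": 128,
    "nvidia_sm": 128,
    "amd_cu": 64,  # 2 CUs = 128
}


def fp32_alus(unit: str, count: int) -> int:
    """Convert a count of vendor-specific units into FP32 ALUs."""
    return FP32_ALUS_PER_UNIT[unit] * count


print(fp32_alus("apple_core", 64))  # 8192
print(fp32_alus("nvidia_sm", 68))   # 8704
print(fp32_alus("amd_cu", 2))       # 128
```

Counted this way, a 64-core Apple GPU and a 68-SM NVIDIA GPU land in the same range, even though the marketed "core" numbers differ by two orders of magnitude.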

I think it has to do with the cost factor: if Nvidia were integrated into Macs, a Mac mini would cost around 2,500 dollars.

bp

It would be nice to see a task-based comparison with performance mapped against cost.

Nvidia's high-end GPUs do not come cheap. The point is: given a budget, how much bang do you get per dollar?

I have an M1 MacBook Pro with 16 GB of shared memory, and I warn people that this means it will fail to render files when it runs out of RAM.

peterbreis

Hey Greg. Great channel with tons of relevant info. Thanks so much. I have a question regarding cores: what are the differences between performance cores, efficiency cores, and the virtual cores that I see in my Activity Monitor but that never seem to get used?

Thanks,

AM

Edit: Another question: what are PCI pools? I just found out about this management aspect of the Mac Pro 2023.

archetypemusic