M4 Mac Mini CLUSTER 🤯

M4 Mac Minis working together as a cluster.

Use COUPON: ZISKIND10

🛒 Gear Links 🛒

🎥 Related Videos 🎥

* 🛠️ Mini PC portable setup -
* 🛠️ Dev setup on Mac -

— — — — — — — — —

❤️ SUBSCRIBE TO MY YOUTUBE CHANNEL 📺

— — — — — — — — —

Join this channel to get access to perks:

— — — — — — — — —

#minipcs #macmini #m4pro
Comments

Thank you Alex, I was thinking of doing this exact project! One note: by using a hub you are creating a star topology with a 40Gbps bottleneck shared between all machines. If you used a partially meshed ring topology, you could connect each mini to 3 other minis with a connectivity set of 1:{5, 2, 3} 2:{1, 3, 4} 3:{2, 4, 1} 4:{3, 5, 2} 5:{4, 1, 2}. I'd be interested in seeing if this improved performance. Another potential advantage of the mini cluster vs a single M4 Max is that all M4-series chips have the same 16 ANE cores; you might be able to run distributed inference on the neural engine to benefit from that scaling.

markclayton

As for the bottom Mac being hotter: try giving it space underneath like the others have. It might be heat-soaking because it can't dissipate heat the way the others can.

fenstermakerwj

macOS servers should be a thing again with M-series chips, honestly.

AKagNA

$2000 in computers, $500 in Thunderbolt cables.

seaweed

Exo Labs actually processes the model's shards on each Mac sequentially, not in parallel, which means it's slower than running on one machine with a lot of RAM due to connection delay. However, if Exo Labs supported Mixture-of-Experts models and allowed the experts to be split between devices, that would give insane performance when using all experts compared to doing it on one device, because all the experts could run in parallel.
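A back-of-envelope latency model illustrates the difference the commenter describes between sequential pipelining and hypothetical parallel expert execution (all timing numbers below are made-up assumptions for illustration, not measurements):

```python
# Toy per-token latency model. shard_compute_ms and link_delay_ms are
# placeholder figures, not benchmarks from the video.

def pipeline_token_latency(n_machines: int, shard_compute_ms: float,
                           link_delay_ms: float) -> float:
    # exo-style pipeline: each token visits every shard in turn,
    # paying one link hop between consecutive machines.
    return n_machines * shard_compute_ms + (n_machines - 1) * link_delay_ms

def moe_token_latency(expert_compute_ms: float, link_delay_ms: float) -> float:
    # Hypothetical MoE split: experts run concurrently on separate
    # machines, so per-token cost is one expert plus a round trip.
    return expert_compute_ms + 2 * link_delay_ms

seq = pipeline_token_latency(n_machines=5, shard_compute_ms=8, link_delay_ms=2)
par = moe_token_latency(expert_compute_ms=8, link_delay_ms=2)
print(f"sequential pipeline: {seq} ms/token, parallel experts: {par} ms/token")
```

Under these toy numbers the pipeline pays compute and link delay once per machine per token, while the parallel-expert split pays them roughly once per token, which is the intuition behind the comment.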

meh

I like the narrative style without getting too deep into the technical weeds, letting the screen do the talking.

nicholasthon

This kind of investigation is great and truly valuable. It’s a significant contribution to the community. This could become a rabbit hole once you start running tests. Thanks for sharing!

davidguerrero

A Mac Mini cluster with MLX and ExoLabs makes sense if you like extending context for the models.
If you just need the model for a "Hello" or "Write a story" query, a single machine would be sufficient for the task.

MrBlogbar

Finally we get to see the results! You've hinted at and shown the racked minis in many prior videos; I was going mad!

Derpalerpa

Correct me if I'm wrong. According to your video:
1. If the LLM can fit on just one Mac mini (M4 Pro), we get maximum tokens per second. Adding more Mac minis actually decreases TPS rather than increasing it.
2. If the LLM is too big for a single Mac mini, the only way to run it is with an exo cluster. However, in that case the TPS will be very low.

So I'm wondering: wouldn't it be better to buy a MacBook Pro M4 Max with 128GB RAM instead of 5 Mac minis? It might be cheaper, and the performance would be much better than an exo cluster of Mac minis (maybe 2-3x the TPS?).

No offense, just want to understand.

zhanglance

You could really see the joy on your face in this video. Like a kid in a candy store. :)

thewalabee

Thanks! Very interested in how this all plays out.

bakermd

Can you daisy-chain the Thunderbolt connections? This would eliminate the hub.

mpsii

CUDA clusters have 'unified' memory between GPUs thanks to BlueField DPUs, so one GPU can be connected to 1TB of memory easily.
It's important to have all GPUs working ON THE SAME MEMORY during training.
Not all models are compatible with clusters and separate memory spaces.

VanillaIceCoffee

Boyyy oh boyyy, this is one heck of a video about the M4 Mac mini! Such a creative video, man! Loved it!

itiswhatitis-yes

Can you run more tests and show a chart with the results per $ spent? We're trying to figure out whether it makes more sense to buy 4 base models in a cluster vs spending the same amount on a maxed-out M4 Pro or Max.
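The comparison being asked for here could be sketched as a simple throughput-per-dollar score (all prices and TPS figures below are placeholders, not measurements from the video):

```python
# Tokens-per-second per dollar. Every price and TPS number here is an
# illustrative assumption; plug in real benchmark results to compare.

def tps_per_dollar(tps: float, price_usd: float) -> float:
    # Higher is better: throughput bought per dollar spent.
    return tps / price_usd

configs = {
    "4x base M4 Mac mini cluster": tps_per_dollar(tps=10.0, price_usd=4 * 599),
    "single maxed-out M4 Max":     tps_per_dollar(tps=25.0, price_usd=3999),
}

for name, score in sorted(configs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.4f} tok/s per dollar")
```

The interesting wrinkle from the video is that the cluster's TPS is not 4x a single mini's, so the per-dollar score is the fairer way to frame the tradeoff.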

beaumac

I’ve always imagined trying LLM clustering using a Mac, and you’ve turned that imagination into reality—I’m so thrilled! Additionally, if there are any Thunderbolt network issues, it would be great to create a video showcasing how to use a 10G network instead. This video is incredibly valuable and practical. Thank you so much for your hard work in making it. I truly appreciate it! ❤❤❤

FUNLIFEG

Awesome bit of bench work, Alex. You always amaze me with how you take things that can be hard to unpack and make them wholly engaging and great opportunities for learning. Keep. It. Up.

GoDuffdaddy

I was waiting for this video for so long, thanks Alex. ❤

elkortby

If it's Thunderbolt Bridge, can you try daisy-chaining Thunderbolt connections to get rid of the bridge?

Alexthesurfer