INSANE Ollama AI Home Server - Quad 3090 Hardware Build, Costs, Tips and Tricks

Interested in training your own AI models? Want to speed up inference beyond a dual-GPU, locally hosted AI home server? This video is a must-watch. You can run A LOT of containers against a small fleet of GPUs like this. I cover the complete hardware build and drop a ton of mounting, cooling, PCIe, and GPU tips and tricks for this EPYC server sporting 512GB of DDR4. Ollama makes use of multiple GPUs automatically, so I am building a monster rig that we will be testing out. 👇 All Parts Used Linked Below 👇
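
Before Ollama can spread a model across all four cards, the host has to see them all. A minimal sanity-check sketch using the NVIDIA management library bindings (`nvidia-ml-py` is an assumed dependency; this is not from the video itself):

```python
# Sketch: confirm all four 3090s are visible and report their VRAM.
# Assumes the nvidia-ml-py package (pip install nvidia-ml-py).
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older bindings return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU {i}: {name}, {mem.total / 1024**3:.1f} GiB total")
pynvml.nvmlShutdown()
```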

Written piece and GPU Rack Modification Instructions

(sTRX4 fits SP3, and the retention kit comes with the CAPELLIX)

Be sure to 👍✅Subscribe✅👍 for more content like this!

Thanks for watching!

Digital Spaceport Website

🛒Shop (Channel members get a 3% or 5% discount)

*****
As an Amazon Associate I earn from qualifying purchases.

When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network.

Other Merchant Affiliate Partners for this site include, but are not limited to, Newegg and Best Buy. I earn a commission if you click on links and make a purchase from the merchant.
*****

0:00 Intro
1:09 Which motherboard
5:49 GPU rack frame
6:59 Which power supply
10:16 Water cooling
12:08 Wattage vs other servers
13:43 How much did it cost
20:37 Conclusions
Comments

Do you have any recommendations for an air cooler for a CPU?

TuanAnhLe-efyk

Why limit the GPUs' TDP?!? Just add another PSU! 4x 3090s is 1400W already, and 512GB of RAM plus the 7702 CPU is another 500W, so add one more PSU, 750W minimum; it costs nothing compared to the price of the system. And with 1500W you don't want to run at the max limit, keep a 20% reserve. If you want a stable, reliable system, your GPUs have to be limited to 150W instead of the default 350W, and that's a huge hit!

treniotajuodvarnis
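
If you do go the power-limit route the commenter argues against, the cap is one driver call per card. A hedged sketch of both the budget arithmetic and the cap (the 200W value is an illustrative assumption, not a figure from the video; `nvidia-smi -pl` needs root):

```python
# Sketch: check the PSU budget, then cap each 3090's power draw.
# CAP_W = 200 is an assumed example; a stock 3090 defaults to ~350 W.
import subprocess

NUM_GPUS, CAP_W = 4, 200
PSU_W, OTHER_W = 1500, 500  # commenter's CPU + RAM + platform estimate

headroom = PSU_W * 0.8 - NUM_GPUS * CAP_W - OTHER_W
print(f"Headroom after 20% reserve: {headroom:.0f} W")  # -100 W: still over

for i in range(NUM_GPUS):
    subprocess.run(["nvidia-smi", "-i", str(i), "-pl", str(CAP_W)], check=True)
```

At 200W per card the box is still 100W over a single 1500W PSU's 80% envelope, which is the commenter's point; 150W per card leaves roughly 100W spare.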

Nice one! I built an 8x A4000 EPYC server, which was...epic! 128GB VRAM <3

isbestlizard

That odd fan is making me go crazy LOL

bekappa

Would also be nice for rendering Blender scenes.

ziozzot

You should get Hailo to sponsor a video with their Hailo-8 or Hailo-10H M.2 module.
Also, how many TOPS is this setup?

mastermoarman

It reminds me of those miner rigs: déjà vu, I have been in this place before 😂

frankwong

I don't understand something about this setup: aren't you limited to just small LLMs? Mainly because only two RTX 3090s can sync via NVLink, so you essentially have two pairs of cards rather than four linked cards.
I'm also wondering about PCIe bottlenecks.

Lastly, I would advise getting enough RAM to load an entire 300-billion-parameter LLM, which works out to about 1.2 TB at full precision.

Could you please discuss the limitations of this setup?

HKashaf
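
The 1.2 TB figure above checks out for full fp32 weights (300e9 parameters x 4 bytes); quantization changes the arithmetic substantially. A back-of-the-envelope sketch, weights only, ignoring KV cache and runtime overhead:

```python
# Back-of-envelope weight footprint at common precisions (weights only;
# KV cache and runtime overhead come on top of these numbers).
def weight_gb(params_b: float, bits: float) -> float:
    return params_b * 1e9 * bits / 8 / 1e9  # decimal GB

for name, bits in [("fp32", 32), ("fp16", 16), ("q8", 8), ("q4", 4)]:
    print(f"300B @ {name}: {weight_gb(300, bits):7,.0f} GB | "
          f"70B @ {name}: {weight_gb(70, bits):5,.0f} GB")
# 4x 3090 = 96 GB of VRAM: a 70B q8 (~70 GB of weights) can fit, but a
# 300B model cannot at any common quantization without heavy CPU offload.
```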

Thanks for the review. I have an ASUS Z10PE-D16 WS mainboard, 2x Xeon E5-2683 v3, 8x 16GB DDR4-2133, 5x 3090s, and several Corsair AX1500i PSUs. I tried 70B q8 and q4 and 405B q2, and they are extremely slow. What am I missing? And what is a 4i SFF-8654? Ty

SuperSayiyajin
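
"Extremely slow" on a multi-3090 box usually means layers are spilling to CPU/system RAM rather than a PCIe problem. One way to quantify it is to time generation through Ollama's REST API, which reports token counts and durations in its response. A sketch (the model name is an assumed example; assumes Ollama serving on its default port):

```python
# Sketch: measure tokens/sec via Ollama's /api/generate endpoint to tell
# GPU-speed inference apart from CPU-offloaded layers.
import json
import urllib.request

payload = {
    "model": "llama3.1:70b-instruct-q4_K_M",  # assumed example model
    "prompt": "Explain PCIe bifurcation in one paragraph.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# eval_count = generated tokens; eval_duration is in nanoseconds.
print(f"{body['eval_count'] / (body['eval_duration'] / 1e9):.1f} tokens/sec")
```

Single-digit tokens/sec on a 70B q4 across that much VRAM usually points to CPU offload; `ollama ps` shows the CPU/GPU split per loaded model. (A 405B q2 is roughly 150 GB of weights, more than 5x 24 GB of VRAM, so that one being slow is expected.)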

Any specific reason for going with the XianXian GPU rack instead of "AAAwave The Sluice V.2"?

mams

I'd love to do something like this, and I have some reasonable hardware to make it happen, but I straight up don't have the power. What do you use as a power source? A giant solar array? My power in CT just went up to $0.35/kWh.

joshhardin

Nice video. Work on that audio though; the voice-overs sound off.

sarahracing

2x 4090 is better than 4x 3090 in every respect.

大支爺

Does it work to mix different generations of GPUs, like RTX 30 and RTX 40 series? Good job!

LucasAlves-bspf

Really cool build; I also have a 4-GPU rig.

The one thing I would recommend is giving the GPUs as much space between them as possible, because packed close together they generate a lot of heat; the difference is enormous.

I would also add extra fans for the GPUs. I personally like a maximum of one GPU per 120mm fan, with the fans blowing air directly at the GPU.

I'm not sure a water cooler is a good idea here. I say that because no server uses water cooling, and neither do CPU miners (people running the CPU at 100% 24/7), since water coolers tend to stop cooling effectively at some point and don't have the best reliability. I'm also not sure a 120mm fan will fit in your build; I'm just giving food for thought.

arturschuch

Nice build! Have you run many training workloads on it?

The single-core perf of the 7702, even with boost, is pretty mediocre. I fear it would bottleneck training unless you spend a bunch of time optimizing data-loading code. I went with a Threadripper Pro for my 4x 3090s for this reason, but I always wondered how a 7702 would perform.

taxplum
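
On the data-loading concern: the usual mitigation on a many-slow-cores part like the 7702 is to parallelize the input pipeline rather than chase single-core speed. A minimal PyTorch sketch (FakeData is a stand-in dataset; the worker count is an assumption to tune empirically):

```python
# Sketch: use many DataLoader workers so weak single-core perf doesn't
# starve the GPUs. FakeData stands in for a real dataset.
import torch
from torchvision import datasets, transforms

dataset = datasets.FakeData(transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=64,
    num_workers=16,           # plenty of cores on a 7702; tune empirically
    pin_memory=True,          # faster host-to-GPU transfers
    persistent_workers=True,  # avoid re-forking workers every epoch
)
for images, labels in loader:
    pass  # training step goes here
```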

Please test mixing cards with different VRAM sizes, like a 3090 (24 GB) and a 4070 (12 GB). Can it balance the work in a way that doesn't crash when it hits the 12 GB mark?

LucasAlves-bspf
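
For what it's worth, llama.cpp-based runtimes (Ollama among them) generally place a model's layers across GPUs in proportion to available VRAM rather than evenly, which is what keeps the 12 GB card from being handed half the model. A toy sketch of that proportional split (layer count and VRAM figures are illustrative, not measured):

```python
# Toy sketch: split N transformer layers across GPUs in proportion to
# free VRAM, in the spirit of tensor-split style placement.
vram_gb = {"3090": 24, "4070": 12}  # illustrative free-VRAM figures
total_layers = 80                    # e.g., a 70B-class model

total_vram = sum(vram_gb.values())
split = {gpu: round(total_layers * gb / total_vram) for gpu, gb in vram_gb.items()}
print(split)  # {'3090': 53, '4070': 27} - the bigger card takes more layers
```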

Good stuff, man. Looking forward to seeing what the performance will be like.

Drkayb

When it comes to tabs, I am your wife to a T lolol, thanks for the shoutout. Loved the video!

krtman

Hi, totally awesome. How can I build a server with performance equal to an AMD Threadripper Pro 7995WX, with an RTX 4090, 128 GB of 6400MHz RAM, and PCIe 5.0 NVMe? I am doing research on building a server for training my AI & ML models. I considered AWS, but it's very costly, so I am considering my own server.

husratmehmood