INSANE Ollama AI Home Server - Quad 3090 Hardware Build, Costs, Tips and Tricks

Interested in training your own AI models? Want to speed up inference beyond a dual-GPU, locally hosted AI home server? This video is a must-watch. You can run A LOT of containers against a small fleet of GPUs like this. I cover the complete hardware build and drop a ton of mounting, cooling, PCIe, and GPU tips and tricks for this EPYC server sporting 512GB of DDR4. Ollama makes use of multiple GPUs automatically, so I am building a monster rig that we will be testing out. 👇 All Parts Used Linked Below 👇
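
Before Ollama can spread a model across all four cards, the host has to see them all. A minimal sanity-check sketch using the NVIDIA management library bindings (`nvidia-ml-py` is an assumed dependency; this is not from the video itself):

```python
# Sketch: confirm all four 3090s are visible and report their VRAM.
# Assumes the nvidia-ml-py package (pip install nvidia-ml-py).
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older bindings return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU {i}: {name}, {mem.total / 1024**3:.1f} GiB total")
pynvml.nvmlShutdown()
```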

Written piece and GPU Rack Modification Instructions

(sTRX4 fits SP3, and the retention kit comes with the CAPELLIX)

Be sure to 👍✅Subscribe✅👍 for more content like this!

Thanks for watching!

Digital Spaceport Website

🛒Shop (Channel members get a 3% or 5% discount)

*****
As an Amazon Associate I earn from qualifying purchases.

When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network.

Other Merchant Affiliate Partners for this site include, but are not limited to, Newegg and Best Buy. I earn a commission if you click on links and make a purchase from the merchant.
*****

0:00 Intro
1:09 Which motherboard
5:49 GPU rack frame
6:59 Which power supply
10:16 Water cooling
12:08 Wattage vs other servers
13:43 How much did it cost
20:37 Conclusions
Comments

Do you have any recommendations for an air cooler for a CPU?

TuanAnhLe-efyk

Why limit the GPUs' TDP?!? Just add another PSU! 4x 3090s is 1400W already, and 512GB of RAM plus the 7702 CPU is another 500W, so add one more PSU, 750W minimum; it costs nothing compared to the price of the system. And with 1500W you don't want to run at the max limit, keep a 20% reserve. If you want a stable, reliable system, your GPUs have to be limited to 150W instead of the default 350W, and that's a huge hit!

treniotajuodvarnis
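
If you do go the power-limit route the commenter argues against, the cap is one driver call per card. A hedged sketch of both the budget arithmetic and the cap (the 200W value is an illustrative assumption, not a figure from the video; `nvidia-smi -pl` needs root):

```python
# Sketch: check the PSU budget, then cap each 3090's power draw.
# CAP_W = 200 is an assumed example; a stock 3090 defaults to ~350 W.
import subprocess

NUM_GPUS, CAP_W = 4, 200
PSU_W, OTHER_W = 1500, 500  # commenter's CPU + RAM + platform estimate

headroom = PSU_W * 0.8 - NUM_GPUS * CAP_W - OTHER_W
print(f"Headroom after 20% reserve: {headroom:.0f} W")  # -100 W: still over

for i in range(NUM_GPUS):
    subprocess.run(["nvidia-smi", "-i", str(i), "-pl", str(CAP_W)], check=True)
```

At 200W per card the box is still 100W over a single 1500W PSU's 80% envelope, which is the commenter's point; 150W per card leaves roughly 100W spare.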

Nice one! I built an 8x A4000 EPYC server, which was...epic! 128GB VRAM <3

isbestlizard

That odd fan is making me go crazy LOL

bekappa

Would also be nice for rendering Blender scenes.

ziozzot

You should get Hailo to sponsor a video with their Hailo-8 or Hailo-10H M.2 module.
Also, how many TOPS is this setup?

mastermoarman

It reminds me of those miner rigs: déjà vu, I have been in this place before 😂

frankwong

I don't understand something about this setup: aren't you limited to just small LLMs? Mainly because only two RTX 3090s can sync via NVLink, so you essentially have two pairs of cards rather than four linked cards.
I'm also wondering about PCIe bottlenecks.

Lastly, I would advise getting enough RAM to load an entire 300-billion-parameter LLM, which works out to about 1.2 TB at full precision.

Could you please discuss the limitations of this setup?

HKashaf
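
The 1.2 TB figure above checks out for full fp32 weights (300e9 parameters x 4 bytes); quantization changes the arithmetic substantially. A back-of-the-envelope sketch, weights only, ignoring KV cache and runtime overhead:

```python
# Back-of-envelope weight footprint at common precisions (weights only;
# KV cache and runtime overhead come on top of these numbers).
def weight_gb(params_b: float, bits: float) -> float:
    return params_b * 1e9 * bits / 8 / 1e9  # decimal GB

for name, bits in [("fp32", 32), ("fp16", 16), ("q8", 8), ("q4", 4)]:
    print(f"300B @ {name}: {weight_gb(300, bits):7,.0f} GB | "
          f"70B @ {name}: {weight_gb(70, bits):5,.0f} GB")
# 4x 3090 = 96 GB of VRAM: a 70B q8 (~70 GB of weights) can fit, but a
# 300B model cannot at any common quantization without heavy CPU offload.
```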

Thanks for the review. I have an ASUS Z10PE-D16 WS mainboard, 2x Xeon E5-2683 v3, 8x 16GB DDR4-2133, 5x 3090s, and several Corsair AX1500i PSUs. I tried 70B q8 and q4 and 405B q2, and they are extremely slow. What am I missing? And what is a 4i SFF-8654? Ty

SuperSayiyajin
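
"Extremely slow" on a multi-3090 box usually means layers are spilling to CPU/system RAM rather than a PCIe problem. One way to quantify it is to time generation through Ollama's REST API, which reports token counts and durations in its response. A sketch (the model name is an assumed example; assumes Ollama serving on its default port):

```python
# Sketch: measure tokens/sec via Ollama's /api/generate endpoint to tell
# GPU-speed inference apart from CPU-offloaded layers.
import json
import urllib.request

payload = {
    "model": "llama3.1:70b-instruct-q4_K_M",  # assumed example model
    "prompt": "Explain PCIe bifurcation in one paragraph.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# eval_count = generated tokens; eval_duration is in nanoseconds.
print(f"{body['eval_count'] / (body['eval_duration'] / 1e9):.1f} tokens/sec")
```

Single-digit tokens/sec on a 70B q4 across that much VRAM usually points to CPU offload; `ollama ps` shows the CPU/GPU split per loaded model. (A 405B q2 is roughly 150 GB of weights, more than 5x 24 GB of VRAM, so that one being slow is expected.)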

Any specific reason for going with the XianXian GPU rack instead of "AAAwave The Sluice V.2"?

mams

I'd love to do something like this, and I have some reasonable hardware to make it happen, but I straight up don't have the power. What do you use as a power source? A giant solar array? My power in CT just went up to $0.35/kWh.

joshhardin

Nice video. Work on that audio though; the voice-overs sound off.

sarahracing

2x 4090 is better than 4x 3090 in every respect.

大支爺

Does it work to mix different generations of GPUs, like RTX 30 and RTX 40 series? Good job!

LucasAlves-bspf

Really cool build; I also have a 4-GPU rig.

The one thing I would recommend is giving the GPUs as much space between them as possible, because packed close together they generate a lot of heat; the difference is enormous.

I would also add extra fans for the GPUs. I personally like a maximum of one GPU per 120mm fan, with the fans blowing air directly at the GPU.

I'm not sure a water cooler is a good idea here. I say that because no server uses water cooling, and neither do CPU miners (people running the CPU at 100% 24/7), since water coolers tend to stop cooling effectively at some point and don't have the best reliability. I'm also not sure a 120mm fan will fit in your build; I'm just giving food for thought.

arturschuch

Nice build! Have you run many training workloads on it?

The single-core perf of the 7702, even with boost, is pretty mediocre. I fear it would bottleneck training unless you spend a bunch of time optimizing data-loading code. I went with a Threadripper Pro for my 4x 3090s for this reason, but I always wondered how a 7702 would perform.

taxplum
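
On the data-loading concern: the usual mitigation on a many-slow-cores part like the 7702 is to parallelize the input pipeline rather than chase single-core speed. A minimal PyTorch sketch (FakeData is a stand-in dataset; the worker count is an assumption to tune empirically):

```python
# Sketch: use many DataLoader workers so weak single-core perf doesn't
# starve the GPUs. FakeData stands in for a real dataset.
import torch
from torchvision import datasets, transforms

dataset = datasets.FakeData(transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=64,
    num_workers=16,           # plenty of cores on a 7702; tune empirically
    pin_memory=True,          # faster host-to-GPU transfers
    persistent_workers=True,  # avoid re-forking workers every epoch
)
for images, labels in loader:
    pass  # training step goes here
```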

Please test mixing cards with different VRAM sizes, like a 3090 (24 GB) and a 4070 (12 GB). Can it balance the work in a way that doesn't crash when it hits the 12 GB mark?

LucasAlves-bspf
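
For what it's worth, llama.cpp-based runtimes (Ollama among them) generally place a model's layers across GPUs in proportion to available VRAM rather than evenly, which is what keeps the 12 GB card from being handed half the model. A toy sketch of that proportional split (layer count and VRAM figures are illustrative, not measured):

```python
# Toy sketch: split N transformer layers across GPUs in proportion to
# free VRAM, in the spirit of tensor-split style placement.
vram_gb = {"3090": 24, "4070": 12}  # illustrative free-VRAM figures
total_layers = 80                    # e.g., a 70B-class model

total_vram = sum(vram_gb.values())
split = {gpu: round(total_layers * gb / total_vram) for gpu, gb in vram_gb.items()}
print(split)  # {'3090': 53, '4070': 27} - the bigger card takes more layers
```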

Good stuff, man. Looking forward to seeing what the performance will be like.

Drkayb

When it comes to tabs, I am your wife to a T lolol, thanks for the shoutout. Loved the video!

krtman

Hi, totally awesome. How can I build a server with performance equal to an AMD Threadripper Pro 7995WX, with an RTX 4090, 128 GB of 6400MHz RAM, and PCIe 5.0 NVMe? I am doing research on building a server for training my AI & ML models. I considered AWS, but it's very costly, so I am considering my own server.

husratmehmood