Which Nvidia GPU is BEST for Local Generative AI and LLMs in 2024?

Struggling to choose the right Nvidia GPU for your local AI and LLM projects? We put the latest RTX 40 SUPER Series to the test against their predecessors! Discover which card reigns supreme in terms of performance per dollar for running everything from Stable Diffusion to custom language models. Whether you're a hobbyist or a serious developer, this video unveils the best value for your AI journey in 2024.

Tell us what you think in the comments!

This video contains affiliate links, meaning if you click and make a purchase, I may earn a commission at no extra cost to you. Thank you for supporting my channel!

My 4090 machine:

Tech I use to produce my videos:

Comments

I have a server in my apartment with a 32-core EPYC processor, and in one of the slots I have a 3090 + Ollama; it does its job pretty well for quantized models. Anything massive I would just run on a cloud GPU, but the progress being made so far on local LLMs is really amazing. I've noticed that GPT-4 seems to be getting a bit more "lazy or concise," forcing me to use the API + AutoGen agents for more thoughtful answers.

datpspguy

The main limiting factor is the amount of VRAM; you only start caring about speed once you can actually run your model!
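The point above can be made concrete with a back-of-envelope calculation: weight memory scales with parameter count times bits per weight. This is a rough sketch, not an exact sizing tool; the flat overhead figure is an assumption, and real usage also grows with context length (the KV cache).

```python
# Rough VRAM estimate for running an LLM at a given quantization.
# The overhead allowance is an assumed ballpark, not a measured figure.

def vram_needed_gb(params_billion: float, bits_per_weight: float,
                   overhead_gb: float = 1.5) -> float:
    """Weights plus a flat overhead allowance, in GB."""
    weight_gb = params_billion * bits_per_weight / 8  # 1e9 params * (bits/8) bytes ≈ GB
    return weight_gb + overhead_gb

# A 7B model at 4-bit fits comfortably in 8 GB of VRAM:
print(f"7B @ 4-bit:  ~{vram_needed_gb(7, 4):.1f} GB")
# A 70B model at 4-bit needs well over 24 GB, beyond any single
# consumer card, which is why people pair 24 GB cards for it:
print(f"70B @ 4-bit: ~{vram_needed_gb(70, 4):.1f} GB")
```

This is why a 16 GB card often beats a faster 12 GB card for LLM work: it changes which models you can run at all, not just how fast they run.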

jmirodg

I’ve had my 4080 for over a year now. What I’m able to do on it has grown tremendously with the improvements in quantization.

joshbarron

Do you think a 4070 Ti Super with 16 GB (vs. 12 GB of VRAM) is worth getting? I currently have a 3060 Ti with 8 GB of VRAM, and render times are alright, but with lots of prompts/realism it takes forever. Thanks in advance!

YoItsOO

Hello, what are your thoughts on connecting a GPU to a mini PC or handheld via OCuLink (aka SFF-8611/8612)? Assuming PCIe Gen 4.

EzaneeGires

Just found your channel. Excellent content! Another sub for you, sir!

andre-le-bone-aparte

Sorry, just to make sure I understand this: a 70B parameter model could work on a 3070 with 8 GB of VRAM? Or are you saying 16 GB minimum? Sorry if I'm misunderstanding you. I'm interested in running Codestral but currently have a 3070 Ti 8 GB. My next upgrade will be either a 3090 or a 4070 Ti Super.

SeanietheSpaceman

Would I be accurate in stating that AI will be able to take any compatible hardware and give it a massive boost in rendering and computing capabilities (not only new hardware developed specifically for it, but also older hardware that is capable of running it)?

danieldelewis

What GPU would you recommend for a laptop, though? It seems like the GPUs they put in laptops are slightly different.

thefastmeow

I'm buying a 4070.
BTW, I just started getting into LLMs and AI. Can you give some tips and channels for beginners? It would be appreciated.

Gigachadder_

Ok great intro, exactly the video I've been looking for, thanks!

Juan-wssy

Is there a big difference in performance and speed for AI tasks like Stable Diffusion between the RTX 4080 Super and the RTX 4090? Which one should I buy, given that I seldom play games? Or should I wait for the 5090 at the end of the year? I'm not a video editor and don't hold any job related to designing or editing; I'm just a casual home user.

uhposxy

Do you develop on Linux via something like WSL, or do you have a dedicated Linux boot? Is there any reason why I might not want to continue using WSL?

nzt

The advantage of a personal system for AI work is the privacy and security of your information and data.
Doing anything in a cloud-based service automatically means going through the Internet to a remote server.
Don't expect privacy, no matter what the service provider says.
Does anyone really expect OpenAI or MS to pass up the opportunity to log and record all transactions with GPT-4?
User queries will provide petabytes of new data for training and tracking purposes. Why do you think it's so cheap to use, considering how expensive data center resources are to provide to millions of people for inference?

glenyoung

What's the approximate relative training and inference speed of a 3090 vs. a 4090? I know that inference speed mainly depends on memory bandwidth, but what does training speed depend on?
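The bandwidth point can be sketched numerically: for single-stream token generation, each token requires roughly one full read of the model weights, so memory bandwidth divided by model size gives an upper bound on tokens per second. (Training, by contrast, is typically compute-bound, so it tracks tensor-core throughput rather than bandwidth.) Bandwidths below are the published specs for each card; the model size and the one-read-per-token assumption are simplifications.

```python
# Bandwidth-bound ceiling on single-stream inference throughput:
# each generated token reads the full weights once, so
# tokens/sec <= memory bandwidth / model size in memory.

def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.0  # assumed: a ~7B model quantized to ~4 bits per weight

for name, bw in [("RTX 3090", 936), ("RTX 4090", 1008)]:
    print(f"{name}: ~{max_tokens_per_sec(bw, MODEL_GB):.0f} tok/s ceiling")
```

Note how close the two ceilings are: the 4090's bandwidth is only about 8% higher than the 3090's, which is why the inference gap between them is much smaller than their training gap.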

bdjblng

Any experience with AMD cards? I hear it's getting easier, with more support.

Buying used: I can get a 3090 or a 7900 XT, both for £500. Which one do you recommend?

lobyapatty

I just picked up some P40s and they seem to work well for inference. I think I'm getting around 2/3 the performance of my 3090s (I have only tested a few models). For the price, they seem hard to beat as a budget GPU if you can get the cooling worked out.

dholzric

Good vid and great info for someone just looking to learn enough to set up an AI tool for business use. I researched GPUs quite a bit and landed on a 4070 thanks to you. I want to build a model on my own data that can spit out info from under 1,000 PDFs. I don't want to learn AI engineering to do it, so I'm just piecing things together. I found that a lot of devs focus on image generation, or the AI model is hooked into online services. Happy that more and more offline options are out there now.

southofgrace

I have a GTX 1650 laptop with a 12th-gen i5, and it can run local ChatGPT-like AI, text-to-voice, and text-to-image. But it struggles with text-to-video AI. Just to give you guys an idea.

handpowers

What are your thoughts on getting a GPU server and throwing in three Tesla P40s?

ue