How to Choose an NVIDIA GPU for Deep Learning in 2023: Ada, Ampere, GeForce, NVIDIA RTX Compared

If you are thinking about buying one... or two... GPUs for your deep learning computer, you must weigh options like Ada, Ampere, the 30-series, the 40-series, and GeForce versus the professional NVIDIA RTX line. These product lines are evolving rapidly. In this video, I discuss this ever-changing landscape as of January 2023.

There are many changes this year. NVLink is no more unless you are dealing with the Hopper server class; the latest-generation PCIe bus now handles cross-GPU data transfer.
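As a rough illustration of what that change means for multi-GPU training, here is a back-of-the-envelope estimate of the time to move one copy of a model's gradients between two GPUs. The bandwidth figures (PCIe 4.0 x16 ≈ 32 GB/s, NVLink 3 ≈ 112 GB/s) and the 350M-parameter model size are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope: a gradient sync between two GPUs costs roughly one
# full copy of the gradients over the link. Bandwidths are rough assumptions.

def transfer_time_s(num_params: int, bytes_per_param: int, link_gb_s: float) -> float:
    """Seconds to move num_params gradients once over a link of link_gb_s GB/s."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / (link_gb_s * 1e9)

PCIE4_X16_GB_S = 32.0   # approximate peak, one direction
NVLINK3_GB_S = 112.0    # approximate, per GPU pair (e.g. a 3090 NVLink bridge)

params = 350_000_000    # a mid-size transformer, fp32 gradients
pcie = transfer_time_s(params, 4, PCIE4_X16_GB_S)
nvlink = transfer_time_s(params, 4, NVLINK3_GB_S)
print(f"PCIe 4.0 x16: {pcie * 1000:.1f} ms per sync")
print(f"NVLink 3:     {nvlink * 1000:.1f} ms per sync")
```

The takeaway is that PCIe syncs are a few times slower than NVLink was, which matters mainly for multi-GPU training with frequent gradient exchanges, not for single-GPU work.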

0:05 Assumptions
1:08 NVIDIA RTX (Pro) or GeForce?
3:27 NVIDIA GeForce
5:55 Memory is King
6:05 Suggestions

#gpu #cuda #lovelace #ada #ampere #Python #Tensorflow #Keras #nvidia #3080 #3090 #4080 #4090 #pytorch
Comments

I'm a beginner in ML and got a used RTX 3060 with 12GB for $200 off eBay. No regrets, and for now it meets my needs. A good upgrade from my old GTX 970!

FiveFishAudio

Great video, thanks Jeff. I bought an RTX 3060 for $340 ($280 as of 6/23), shipped from B&H on Black Friday. The 12 GB of RAM at that price was the deciding factor. BEWARE: NVIDIA is now selling a "3060" with 8 GB of RAM in addition to the 12 GB version, so read the specs carefully before you buy.

datapro

I like your explainers @jeff Heaton.

Don't forget, the professional workstation RTX models use ECC RAM, use less power, and have a profile that fits in most cases. The GeForce models are enormous!
In my line of work, ECC RAM and TDP are a huge deal for targeted AI/ML processing. I don't play computer games, nor do I imagine I ever will. I may have an Xbox to de-stress occasionally with NBA 2K15 (yep, that's how invested I am in gaming).

If I could share images - I'd share my matrix comparison of specs and current price differences.

There's no question that the previous-gen RTX A5000 gives you a big performance bump over the RTX 4000 SFF Ada. But the 4000 Ada runs at a 70 W TDP (no power cable necessary!) versus 230 W, still has a fair number of Tensor Cores and a decent amount of RAM, and it's MUCH smaller.

Each one of these video cards needs to be matched to its expected workload; they're just too expensive not to optimize for the mission. For instance, if you look at what's benchmarked at techpowerup for supposedly comparable GPUs, the previous-gen RTX A5000 crushes the RTX 4070 in FP64 (455.4 GFLOPS for the 4070 vs 867.8 GFLOPS for the A5000).
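Using the FP64 figures quoted above, the value gap can be made explicit with a tiny ratio calculation (numbers as quoted in this comment from techpowerup; treat them as approximate).

```python
# FP64 throughput ratio from the techpowerup numbers quoted above.
A5000_FP64_GFLOPS = 867.8
RTX4070_FP64_GFLOPS = 455.4

ratio = A5000_FP64_GFLOPS / RTX4070_FP64_GFLOPS
print(f"A5000 is {ratio:.2f}x the 4070 in FP64 throughput")
```

Of course, FP64 only matters for workloads that actually use double precision (scientific computing more than typical deep learning), which is exactly the point about matching the card to the mission.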

Here's the price ladder:
PNY NVIDIA RTX 4000 SFF Ada: $1,500
PNY NVIDIA RTX A5000: $1,670 ($170 more)
PNY NVIDIA RTX A5500: $2,289 ($619 more)
PNY NVIDIA RTX A6000 48GB GDDR6: $3,979 ($1,690 more)
PNY NVIDIA RTX 6000 Ada 48GB GDDR6: $7,985 ($4,006 more)

Those are REALLY big steps-up in cost!


jeffm

In Montreal right now. Got a 3090 brand new for about 750 Canadian on Marketplace. Your videos are always helpful. I'm doing a Master's in ML here, and even though I have cluster access, a personal GPU helps a lot, especially with smaller assignments and projects. If anyone is curious: RoBERTa on my dataset for an NLP task took 2 hours for training and 2 for fine-tuning over 8 epochs in PyTorch. Mixed precision most likely would have sped it up even more.
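A quick sketch of what mixed precision does and does not buy you, memory-wise. The 125M-parameter figure for RoBERTa-base and the per-parameter byte counts (fp32 master weights plus two fp32 Adam moment buffers) are illustrative assumptions about a typical mixed-precision recipe, not measurements of any specific run.

```python
# Rough training-memory estimate: weights + gradients + Adam states only,
# activations excluded. Byte counts per parameter are assumptions.

def train_mem_gb(num_params: int, mixed_precision: bool) -> float:
    if mixed_precision:
        # fp16 weights + fp16 grads + fp32 master weights + two fp32 Adam states
        bytes_per_param = 2 + 2 + 4 + 8
    else:
        # fp32 weights + fp32 grads + two fp32 Adam states
        bytes_per_param = 4 + 4 + 8
    return num_params * bytes_per_param / 1e9

ROBERTA_BASE_PARAMS = 125_000_000  # approximate
print(f"fp32:  {train_mem_gb(ROBERTA_BASE_PARAMS, False):.1f} GB of state")
print(f"mixed: {train_mem_gb(ROBERTA_BASE_PARAMS, True):.1f} GB of state")
```

Counterintuitively, the optimizer/weight state comes out the same (16 bytes per parameter either way under these assumptions); the real wins from mixed precision are halved activation memory and Tensor Core throughput, which is why it mostly buys speed, as the comment suggests.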

TheTakenKing

I found a 3060 12GB that had been pulled from an OEM system on Amazon about a year ago for about $480, and it has been pretty good.

DavePetrillo

@Jeff, one more thing that goes in favor of the RTX 40-series is its support for FP8. This year, NVIDIA is releasing CUDA 12 with FP8 support for training and inference, and FP8 only runs on the RTX 40-series (not the RTX 30).
I think that is something one should also consider while buying a GPU. It is better to spend a bit more now so that your GPU remains relevant for the next 3-4 years.

aayushgarg

Thank you for the wonderful video! Currently using a 3060, looking to move to the 40-series. Will be waiting for new videos once the new setup is ready!

yoloeva

Hello, your video is incredible. An update would be welcome now that NVIDIA's new 4060 Ti 16GB is out. Considering its memory type, bandwidth, and bus width... is it fair to sacrifice all that for more VRAM? Could this be the best card for ML?

What do you think about it?

FzZiO

You're missing the most important spec for performance in these kinds of applications: VRAM bandwidth. In this regard, RTX 40/Ada is abysmal value, and RTX 30 is much better: equally fast or even faster (the 3080 10GB has about 7% more memory bandwidth than the 4080) for half the price.
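To make the bandwidth argument concrete, here is a hedged back-of-the-envelope roofline: batch-1 LLM inference is roughly memory-bound, so tokens/sec is bounded above by bandwidth divided by the bytes of weights read per token. The spec-sheet bandwidths (3080 10GB ≈ 760 GB/s, 4080 ≈ 717 GB/s, 3090 ≈ 936 GB/s) are approximate assumptions, not benchmarks.

```python
# Memory-bound upper limit on single-stream LLM inference speed: every token
# reads (approximately) all the weights once, so bandwidth caps tokens/sec.

def tokens_per_s_upper_bound(num_params: int, bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/sec for a memory-bound decode step."""
    return (bandwidth_gb_s * 1e9) / (num_params * bytes_per_param)

MODEL_7B = 7_000_000_000  # illustrative 7B-parameter model, fp16 weights

for name, bw in [("RTX 3080 10GB", 760.0), ("RTX 4080", 717.0), ("RTX 3090", 936.0)]:
    tps = tokens_per_s_upper_bound(MODEL_7B, 2.0, bw)
    print(f"{name}: <= {tps:.0f} tokens/s")
```

Real throughput lands well below these bounds (kernel overheads, KV-cache reads), but the ranking by bandwidth tends to hold for this kind of workload.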

ProjectPhysX

Amazing video, Jeff! Please, could you show some models running and how much memory they consume? Thank you!

JoseSilva-gtzj

Great video! Thanks for your efforts. What do you think about the A5000 in terms of deep learning performance? Given its specs, I assume it can be considered an alternative to the 4070 Ti. Would you suggest the 4090 or the A5000 for deep learning research, prices aside?

ggpykut

I'm looking forward to FP8, which should be coming to the 4090s as a software update. That would allow running 20B-parameter models at full speed in 24 GB of RAM, with little loss in quality and some performance gain, since it needs less memory bandwidth. But I'll probably be renting 4090s on RunPod; for my own machine, I just bought a used Dell Precision 7720 with a Quadro P5000 with 16 GB of VRAM. It runs the 7B Alpaca/LLaMA models at 7 tokens/second, and it runs Whisper speech-to-text slightly faster than real time, which was around 30% of the speed of a 3090.

(I even experimented with 3-bit quantization for the LLaMA 30B model, but it wasn't worth it: it was super slow, and it still ran out of RAM on large queries.)
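The memory arithmetic behind the "20B in 24 GB" claim can be sketched as a simple weights-only fit check. Real runtimes add KV-cache and framework overhead, so treat the result as optimistic rather than a guarantee.

```python
# Weights-only VRAM fit check: does a model of a given size, at a given
# precision, fit in a card's VRAM? Overheads (KV-cache, buffers) are ignored.

def weights_gb(num_params: int, bits_per_param: float) -> float:
    """GB needed just to hold the weights at the given precision."""
    return num_params * bits_per_param / 8 / 1e9

def fits(num_params: int, bits_per_param: float, vram_gb: float) -> bool:
    return weights_gb(num_params, bits_per_param) <= vram_gb

MODEL_20B = 20_000_000_000
for bits in (16, 8, 4):
    gb = weights_gb(MODEL_20B, bits)
    print(f"20B @ {bits}-bit: {gb:.0f} GB of weights, "
          f"fits in 24 GB: {fits(MODEL_20B, bits, 24.0)}")
```

Under these assumptions a 20B model needs 40 GB at fp16 but only 20 GB at 8-bit, which is exactly why FP8 (or int8 quantization) is what makes such models viable on a 24 GB card.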

nathanbanks

I've noticed something interesting here in Europe: used 3090s seem significantly cheaper than I remember from a few months ago (no surprise, I guess). They're priced somewhere between a 4070 Ti and a 4080, but closer to the 4070 Ti. I'm seriously considering getting a used 3090 for that sweet, sweet VRAM.
I'm building my first PC right now. Deep learning is more a hobby for me, nothing at all related to my day job.

xntumrfoivrnwf

Great video, Professor Heaton. I appreciate your opinion and all the effort you put into these videos; your guidance has helped me make choices in my PC builds for machine learning. I recently updated my 2019/2020 ML desktop with additional memory and an RTX 3090 Ti 24GB FE that I purchased at $1,599.99 in mid-2022 (slightly below the $1,999.99 MSRP). Two months later, I was able to purchase another RTX 3090 Ti 24GB FE for $1,099.99 (still available at this price directly from NVIDIA), which I currently use in an eGPU to accelerate a Thunderbolt 4 Intel laptop (Windows 11 and WSL 2 work great, and I'm now trying to get it working with native Ubuntu). I'm also building a new small-form-factor (SFF) desktop for my home office, but I'm waiting for the RTX 4090 24GB FE to be available at MSRP (I'm very reluctant to buy anything above MSRP). I feel the RTX 3090 Ti 24GB FE at $1,099.99 and the RTX 4090 at MSRP are better choices than the RTX 4080!

techtran

There is also the budget option of used cards from the NVIDIA Tesla series, mainly the P100 and V100. They are a pain to set up and generally hard to find at a reasonable price, but absolutely worth it if you only care about deep learning. I was using a P100 for 250€ until I got my hands on a 32 GB V100 for 700€ (which is an absolute steal, btw).

shafiiganjeh

You really need to also consider TensorFlow and PyTorch compatibility with the 40-series as of now. If you need to dig right in, the 30-series is the safer choice.

ericp

What is your opinion on the 4060 Ti 16GB? I want to use it for LangChain models, etc.

pranjalable

I can really agree on VRAM. It's unfortunate that NVIDIA left such a big gap between the 3060 and the 3080 Ti. And still, 12 GB is not much; it would be great to see more VRAM in the future. I saw a rumor of a 3070 Ti with 16 GB, but never found it in the wild.
In addition to the information in this video, it would be nice to know how memory bandwidth affects speed. If you have any benchmarks or info, please share.
Thanks!

Argoo

Hey Jeff, thanks for the great content. Have you messed around with AMD & ROCm at all? It would be interesting to learn more about that.

henry

I have an MSI Ventus GeForce RTX 3060 12GB. In my opinion, it's a little slow for DL, but it gets the job done; it just takes a while!

saminmahmud