How to Turn Your AMD GPU into a Local LLM Beast: A Beginner's Guide with ROCm

Products provided by Gigabyte
Those of us with NVIDIA GPUs, particularly ones with enough VRAM, have been able to run large language models locally for quite a while. I did a guide last year showing you how to run Vicuna locally, but that really only worked on NVIDIA GPUs. Support has improved a little since then, including the option of running everything on your CPU instead, but with the launch of AMD's ROCm software, running large language models locally is now not only possible, it's insanely easy. If you are just here for the guide, skip ahead. If you want to know why this works so damn well, stick around!
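
Once a model is loaded, you can also talk to it from code: LM Studio (the app the guide uses) can expose the loaded model through an OpenAI-compatible local server. A minimal sketch, assuming that server is enabled on its default port 1234 and using the official openai Python client; the model name and prompt are placeholders:

```python
# Minimal sketch: chat with a locally loaded model over LM Studio's
# OpenAI-compatible server. Assumes the server is running on its default
# port (1234); "local-model" is a placeholder, since LM Studio answers
# with whichever model is currently loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Explain ROCm in one sentence."}],
)
print(response.choices[0].message.content)
```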

Use referral code "techteamgb20" when signing up!

As an Amazon Associate I earn from qualifying purchases, using the links below or other Amazon affiliate links here.

Bitcoin donations: 1PqsJeJsDbNEECjCKbQ2DsQxJWYqmTvt4E


Comments

Thanks for letting us know about this new release. Just tried it on my 6800xt, and it works. FYI, I think the supported list is all Navi 21 cards and all RDNA 3. That's the same list as the HIP SDK supported cards on the AMD ROCm Windows System Requirements page.

misterpdj

I've successfully run 70B models with 4-bit quantization on my 4070 Ti Super. I offload 27 out of 80 layers to the GPU, while the remainder sits in system RAM. It works quite well: not blazingly fast, but fast enough for comfortable use. A minimum of 64GB of RAM is required. VRAM matters, but in practice you can run 70B networks with even 10GB of VRAM or less; it ultimately comes down to how long you're willing to wait for the model's responses.

ThePawel
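
The partial-offload setup ThePawel describes above is the same "GPU layers" knob that llama.cpp-based tools such as LM Studio expose in their UI. A minimal sketch of the equivalent in the llama-cpp-python bindings, which is my own assumption for illustration; the model path is a placeholder:

```python
# Sketch of partial GPU offload via the llama-cpp-python bindings
# (an assumption for illustration; LM Studio exposes the same idea as
# its "GPU layers" setting). The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-70b.Q4_K_M.gguf",  # 4-bit quantized 70B model
    n_gpu_layers=27,  # 27 of 80 layers go to VRAM; the rest stay in system RAM
    n_ctx=4096,       # context window
)

out = llm("Q: Why offload only some layers? A:", max_tokens=64)
print(out["choices"][0]["text"])
```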

It works awesome on the 6800xt. Thank you for the guide.

djmccullough

Thank you for the video, I can now use 8B LLM models with my AMD RX 7600 (8GB) and it is really fast. I use Arch Linux and it runs without any problems 👍

sebidev

Brilliant! Thanks for letting us know, I'm excited to try this.

cj_zak

Thanks, it worked very well on my 6800xt! The answers are as quick as in the video. But I guess I need to learn how and what to ask, because the answers were always very confident and always completely wrong and made up. I asked the chat to list French kings who were married off before they were 18, and it invented a bunch of kings who never existed, and claimed that Emperor Napoleon Bonaparte and President Macron were both married off at 16. Neither was technically a king, and they were certainly not married at 16, lol.

myroslav

Will be trying this out later on, thank you my man.

pedromartins

Amazing video, I learnt a lot! I love these videos about commercial GPUs running AI/ML workloads, as I'm into developing AI/ML models.

joshuat

It's not working for me. I have a 7900 XT installed and attempted the same as you, but it just throws an error message for no apparent reason. Drivers are up to date and everything seems in order, but nothing.

MiguelGonzalez-nvrt

Well, it is not as if GPGPU came along just with LLMs. OpenCL on AMD GPUs in 2013 and earlier was the most viable option for crypto mining, while Nvidia was too slow at the time due to small cache sizes and poor efficiency. That all changed with the 750 Ti and the GTX 9xx generation of cards. The history of GPU programming is even longer than that, as people were trying to bend even fixed-pipeline GPUs into calculating things unrelated to graphics. The GeForce 8, with its early and limited CUDA, was of course a game changer, and I have been a big fan of CUDA and OpenCL ever since. Thanks for a great video on the 7600 XT! ❤

VasilijP

GPU not detected on RX 6800, Windows 10. Edit: never mind, you have to load the model first from the top center.

losttale

Does anyone know of a way to make an RX 580 run with ROCm on Windows? Yes, it's old, but it would be better than using the CPU to play with AI, and there are plenty of RX 580s out there.

studiomusicflow

Are any of these models that we can run locally uncensored/unrestricted?

rdsii

Wait, 30-billion-parameter models are fine with GGUF and 16GB, or even 12? Is there something I'm missing?

barderino
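
Some back-of-the-envelope arithmetic for barderino's question (my own numbers, not from the video): a 4-bit GGUF quant stores roughly 4.5 to 5 bits per weight once quantization metadata is included, so a 30B model lands around 18GB for the weights alone, which is why it only fits on a 16GB or 12GB card with partial offloading:

```python
# Rough VRAM math (my own estimate, not from the video).
params = 30e9          # 30B-parameter model
bits_per_weight = 4.8  # rough average for a 4-bit GGUF quant like Q4_K_M
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weights_gb:.0f} GB for the weights alone")  # ~18 GB

# Add a few GB for the KV cache and you're past 16GB, so a 30B quant runs
# on a 12-16GB card only by keeping some layers in system RAM.
```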

Thanks, this is the only good video I could find on YouTube that explained everything easily. Your accent helped me focus. Very useful stuff.

jakeastles

Seeing as I spent last night trying to install ROCm without any luck, and couldn't find any good tutorials or a single success story, I'll be curious to see how insanely easy this is. Wait, I don't need to install and run ROCm in WSL?

BigFarm_ah

Can you add multiple AMD GPUs together to increase the power?

mysticalread

As a total dummy about all things LLM, your video was the catalyst I needed to start learning about all this AI stuff. I'm wondering, and it would be greatly appreciated if you made a video about this: is it possible to put this GPU in my streaming PC so that it encodes and uploads the stream while simultaneously running a local LLM that interacts with the chat on Twitch? How can I integrate these models with my Twitch streams?

ols
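
On ols's Twitch question: the pieces do exist to wire this up yourself, since Twitch chat is reachable over plain IRC and LM Studio's local server speaks an OpenAI-compatible API. A rough sketch under those assumptions; this is my own wiring, not anything from the video, and the OAuth token, bot name, and channel are placeholders:

```python
# Rough sketch: read Twitch chat over IRC and answer with a local LLM
# served by LM Studio. Token, bot name, and channel are placeholders.
import socket
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

sock = socket.socket()
sock.connect(("irc.chat.twitch.tv", 6667))
sock.send(b"PASS oauth:your_token_here\r\n")
sock.send(b"NICK your_bot_name\r\n")
sock.send(b"JOIN #your_channel\r\n")

while True:
    data = sock.recv(2048).decode("utf-8", errors="ignore")
    if data.startswith("PING"):  # keep the connection alive
        sock.send(b"PONG :tmi.twitch.tv\r\n")
    elif "PRIVMSG" in data:
        # Crude parse of ":user!... PRIVMSG #chan :message"; assumes one
        # message per read, which is fine for a sketch.
        text = data.split("PRIVMSG", 1)[1].split(":", 1)[1].strip()
        reply = client.chat.completions.create(
            model="local-model",
            messages=[{"role": "user", "content": text}],
        ).choices[0].message.content
        # Twitch caps message length, so truncate the reply.
        sock.send(f"PRIVMSG #your_channel :{reply[:400]}\r\n".encode())
```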

Does anyone know for sure whether there is RX 580 support? (It's not on the ROCm list, which is why I'm asking.) Or at least, does it work with the RX 6600M? I only see the RX 6600 XT on the compatibility list.

jbnrusnya_should_be_punished

Am I required to install the AMD HIP SDK for Windows before I can use LM Studio?

dougf