Install and run LLMs locally with Text Generation Webui on AMD GPUs!
Let's set up and run large language models, similar to ChatGPT, locally on our AMD GPUs!
### Installing ROCm
sudo apt update
sudo apt install git python3-pip python3-venv python3-dev libstdc++-12-dev
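# The description skips fetching the amdgpu-install package itself. For ROCm 5.6 on
# Ubuntu 22.04 it comes from repo.radeon.com; the exact deb version below is an assumption:
wget https://repo.radeon.com/amdgpu-install/5.6/ubuntu/jammy/amdgpu-install_5.6.50600-1_all.deb
sudo apt install ./amdgpu-install_5.6.50600-1_all.deb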
sudo apt update
sudo amdgpu-install --usecase=graphics,rocm
sudo usermod -aG video $USER
sudo usermod -aG render $USER
sudo reboot
### Installing Text Generation Webui
mkdir ~/gpt
cd ~/gpt/
# Clone the webui repository (this step was missing from the description)
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui
# Setup virtual env
python3 -m venv venv
source venv/bin/activate
# Install correct torch ROCm5.6
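# The actual command is not shown in the description; PyTorch publishes ROCm 5.6 wheels on its own index:
pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.6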
# Install the rest of dependencies
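# Presumably the repo's standard requirements file:
pip3 install -r requirements.txt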
# Installing with no avx support
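# For CPUs without AVX2, the repo ships a separate requirements variant (file name per the repo at the time):
pip3 install -r requirements_noavx2.txt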
# Create launch script
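# The script's name is not given in the description; "launch.sh" is assumed here and below:
nano launch.sh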
# Inside launch script paste:
#!/bin/bash
source venv/bin/activate
# Use only the first GPU
export HIP_VISIBLE_DEVICES=0
# Report the GPU as gfx1100 (RDNA 3) so ROCm kernels load
export HSA_OVERRIDE_GFX_VERSION=11.0.0
# The launch command was missing from the description; server.py is the webui's entry point
python server.py
### Save and exit your launch script
# Make script executable
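# Assuming the launch.sh name from above:
chmod +x launch.sh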
# Now you can launch webui with your script
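./launch.sh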
Model from the video:
TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
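# To fetch the model you can use the download script that ships with the webui
# (run from the text-generation-webui directory):
python download-model.py TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ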
Some settings to check if models are not loading:
Disable ExLlama
Set compute dtype to float32
Set quant type to fp4
The Transformers loader works most of the time, though it is not always the most performant.
Generally load_in_8bit and load_in_4bit will not work -- they rely on bitsandbytes, which has poor ROCm support.