Run ANY LLM Using Cloud GPU and TextGen WebUI (aka OobaBooga)

Enjoy :)

Join My Newsletter for Regular AI Updates 👇🏼

Need AI Consulting? ✅

Rent a GPU (MassedCompute) 🚀
USE CODE "MatthewBerman" for 50% discount

My Links 🔗

Media/Sponsorship Inquiries 📈

Comments

I appreciate that you find and post these but also walk us through the setup. Huge time saver! Thank you!

jeremybristol

TheBloke is not the author of these models, as stated in the model cards, but provides quantized versions of them.

MrPuschel

While 65B models are definitely beyond reasonable consumer hardware, all you need to run 33B models is 8 GB of VRAM and 32 GB of system RAM. I get ~1.1 tokens per second using an RTX 3070 and an R5 3600, meaning you can run a lot of these SOTA models on pretty cheap local hardware.
Also, a small correction: TheBloke doesn't make those models; he quantizes them to 4/5-bit so that we can all run them. It's super cool that he does that, but he doesn't *make* all those models that you've stated there. Eric Hartford and Tim Dettmers are the two big model authors at the moment.

theresalwaysanotherway
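
As a rough check on the memory figures in the comment above, the size of a model's weights at a given quantization width can be estimated as parameters × bits / 8 bytes. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, it counts weights alone (no KV cache, activations, or runtime overhead), and it takes 1 GB to mean 10^9 bytes.

```python
def weight_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width and
    nothing else (KV cache, activations, overhead) is counted.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def split_across_memory(total_gb: float, vram_gb: float) -> tuple[float, float]:
    """Fill VRAM first; whatever does not fit spills into system RAM."""
    on_gpu = min(total_gb, vram_gb)
    in_ram = total_gb - on_gpu
    return on_gpu, in_ram


if __name__ == "__main__":
    # 33B parameters at 4-bit, 5-bit, and unquantized 16-bit widths,
    # with the 8 GB of VRAM mentioned in the comment.
    for bits in (4, 5, 16):
        size = weight_size_gb(33, bits)
        gpu, ram = split_across_memory(size, vram_gb=8)
        print(f"{bits:>2}-bit: {size:6.2f} GB total, "
              f"{gpu:.2f} GB on GPU, {ram:.2f} GB in system RAM")
```

Under these assumptions, 4-bit weights for a 33B model come to roughly 16.5 GB, which is more than 8 GB of VRAM. That is consistent with the commenter needing system RAM as well as the GPU.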

It would be amazing to have a guide like this specifically for setting up the best model for coding with the largest token context window... for us plebs who do not have access to Anthropic yet 😁 Appreciate your hands-on, get-started-fast kind of flavor here, Matthew!

wolphiekun

Does it cost money to train, and then turn off the GPU and then use it again? And is it impossible to download the trained model to a local machine?

mingyukang

If the model is trained on that pod, can it be saved or downloaded? If the data gets destroyed, what is the point of the training? I see this has been asked here, but there is no clear answer. Thanks!

aihome

I have a question. If you're working with proprietary data or private data (like PII), and you don't want to risk sending that data over the internet to Podman or OpenAI or whatever cloud based model, how would you fine tune your data? Is local training on your own local machine the only option?

RedShipsofSpainAgain

I am doing it now; the uncensored was the push I needed 😁

pollywops

FWIW, Ada is not pronounced "Ay-Dee-Ay"; it's "Ayda," as in Ada Lovelace, acclaimed as the first programmer.

jwesley

I'm surprised you don't have more followers. Keep going!

joelzola

Heads up, you can click the copy icon to the right of the label so that you get a pretty paste.

SirajFlorida

Would you use this for only prototyping, or could they be left running reliably to be the hardware in a paid service?

autophilei

Hey Matthew,
Great Video! When can we expect a video about training our own LLM?

surajthakkar

You should add an annotation that when setting up a pod you should override the persistent storage, because RunPod sets persistent storage to 100 GB and it would eat up your budget very fast.

Uterr

thnx man best vid so far for me and my quest to actually get things done ;)

dik

It's the content we deserve 😭 Everything is to the point, love this. I especially love your videos where you show us recent papers.
Could I ask you a question about what computer specs I need to use a cloud GPU successfully? What built-in CPU and GPU characteristics do I need?

goldhydride

Would like a video on how to train a model using those steps.

moon

Quick note: TheBloke isn't actually the author of the models; he just converts existing models to support llama.cpp.

HampusAhlgren

Your tech tutorials are bar none the best: clear and concise, with exact troubleshooting fixes. Can you give a tutorial on how to use VS Code with RunPod through SSH? Each time it tries to connect to the server, it asks for a password. I've gone through their troubleshooting, but nothing is working.

kitrunner

But we need a tutorial on training!! That's what a lot of us need! I want to build my own models for my own business, so I need to figure out how to train the AI to have a full understanding of what I'm doing and the data behind it. Are there any videos you can point me to so I can start learning how to train this AI?

Anarchy-Is-Liberty
