Host your own LLM in 5 minutes on RunPod, and set up an API endpoint for it.


Please note: if you are using this for anything other than testing, you should restrict access with an API key.
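As a rough illustration of what such a keyed request might look like, here is a minimal sketch of calling a pod's exposed HTTP service through the RunPod proxy with an `Authorization` header. The pod ID, port, endpoint path, and header handling are all placeholders/assumptions (TheBloke-style text-generation-webui templates have commonly exposed an API on port 5000, but check your own pod, and note that enforcing the key is up to whatever gateway you put in front of the service):

```python
import json
import urllib.request

POD_ID = "abc123xyz"            # placeholder: your pod's ID
PORT = 5000                     # placeholder: the API port your template exposes
API_KEY = "replace-with-your-key"

# RunPod's proxy URLs follow the https://<pod-id>-<port>.proxy.runpod.net pattern.
url = f"https://{POD_ID}-{PORT}.proxy.runpod.net/api/v1/generate"
payload = {"prompt": "Hello, world!", "max_new_tokens": 50}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Only useful if something on your side actually checks this key.
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Uncomment to actually send the request once the pod is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The request is only constructed here, not sent, so you can adapt the URL and headers to your own deployment before trying it.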
Comments

I couldn't get it running all day yesterday; your step-by-step approach is wonderful. Thank you.

kaynkayn

I was really looking for something like this. Thank you so much. Can you make a video on how to use AgentKit by BCG?

prnmdid

Hi Thomas, can you provide guidance on how to select a GPU based on the model we'd like to test? For example, if I want to test Goliath 120B at reasonable speeds, how do I know which GPUs to deploy? Thanks.

nxuhdbg
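A rough back-of-envelope sketch for the GPU question above (this is an approximation, not a definitive sizing rule: the 20% overhead factor is an assumption, and real requirements vary with backend, quantization format, and context length):

```python
def vram_estimate_gb(params_billion: float, bits_per_param: float,
                     overhead: float = 1.2) -> float:
    """Very rough inference VRAM estimate in GB:
    parameter count x bytes per parameter, plus ~20% for
    KV cache and activations (assumed overhead factor)."""
    return params_billion * (bits_per_param / 8) * overhead

# Goliath 120B at 4-bit quantization:
print(round(vram_estimate_gb(120, 4)))   # -> 72 (GB)
```

By this estimate, a 4-bit Goliath 120B needs on the order of 72 GB, so you would be looking at a single 80 GB card or splitting across two smaller GPUs; the same formula puts an unquantized fp16 7B model around 17 GB.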

Which is the more cost-effective way to host our LLMs on RunPod: serverless, or a regular pod?
Use case: nothing production-level, just testing different LLMs, including in some autonomous agent networks, which can burn money pretty quickly on GPT-4. So: running local LLMs on RunPod a few times a day for a few hours; it shouldn't be always on, and the instance doesn't need to spin up very quickly...
I think serverless is better for this use case, but I'm not sure. What's your opinion?

attilavass

Isn't this a bit slow for a 7B model running on a (freakin'!) H100?
I'm getting roughly the same speed here with an RTX 2070 and 5-bit quantized 7B models...

Thanks for the tutorial though; I was going to look into RunPod.

nemai

I keep getting "HTTP service not ready" for the ports. Is there an additional step required for this?

kelv

I can't connect to HTTP port 7860; it says it's not ready. Also, in the logs I'm getting this error: "AttributeError: module 'gradio.layouts' has no attribute '__all__'". Can you help, please?

emiryuce

I'm getting a 405. I don't think I used TheBloke's template with the API enabled; I guess that's why.

obygknu