Hugging Face SafeTensors LLMs in Ollama

In this video, we're going to learn how to use Hugging Face safetensors models with Ollama on our own machine.
We'll also learn how to quantize the model to reduce the memory required and increase the number of tokens generated per second.

#llms #ollama #safetensors
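
The import-and-quantize workflow described above can be sketched roughly as follows. This is a minimal sketch, not the exact steps from the video: it writes a one-line Modelfile whose `FROM` points at a local safetensors directory, then assembles (without running) an `ollama create` command with the `--quantize` flag. The directory `./my-safetensors-model`, the model name `my-model`, and the `q4_K_M` level are all hypothetical placeholders; which quantization levels are accepted depends on your Ollama version.

```python
from pathlib import Path
import shlex


def write_modelfile(model_dir, modelfile_path="Modelfile"):
    """Write a one-line Modelfile whose FROM points at a safetensors directory."""
    path = Path(modelfile_path)
    path.write_text(f"FROM {model_dir}\n", encoding="utf-8")
    return path


def build_create_command(model_name, modelfile_path, quantization=None):
    """Assemble the `ollama create` argument list; nothing is executed here."""
    cmd = ["ollama", "create", model_name, "-f", str(modelfile_path)]
    if quantization:
        # Assumption: the installed Ollama version supports this level.
        cmd += ["--quantize", quantization]
    return cmd


if __name__ == "__main__":
    # Hypothetical directory holding the downloaded *.safetensors files.
    modelfile = write_modelfile("./my-safetensors-model")
    # Print the command so it can be reviewed and pasted into a terminal.
    print(shlex.join(build_create_command("my-model", modelfile, "q4_K_M")))
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Ollama installed; once the model is created, `ollama run my-model` would start it.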

Comments

I can't find help about 'Error: llama runner process has terminated: error loading model: check tensor dims: tensor 'token_embd.weight' has wrong shape; expected 4090, 128257, got 4096, 128256, 1, '. Can you assist me?

kylekwon

Hello, thanks for the great videos. I've been browsing your channel for several hours now. Just a question: is it possible to use Ollama to do fine-tuning?

saramirabi

How do I install modeldownloader? I tried git clone and then ran hfdownloader in cmd, but I still get an error: it's not recognized as an internal or external command. Thx

bocilmillenium

Hello, I'm stuck at the quantize part. Can you help? I'm using the terminal on macOS with Ollama. Please send me the terminal commands to quantize a safetensors LLM with the create -q command in Ollama (Q5_K_M). Thank you

janithaoshan

Thank you so much. I am having a problem running models downloaded from Hugging Face that come as safetensors files. I have these files in I have to use this for ollama. I followed everything and even created a modelfile with the path to the safetensors directory, but it is not running >> ollama create model_name -f modelfile. Please help me.

parthwagh

Hi, I get the error "Error: unknown data type: U8". Has anyone solved a similar problem?

ghrvinh

I can't find help about 'Error: llama runner process has terminated: signal: aborted'. Can you assist me?

ZenitoGR

I keep getting "incorrect function". Any advice?

generolas