Ollama on Linux: Easily Install Any LLM on Your Server

Ollama has just been released for Linux, which means it's now dead simple to run large language models on any Linux server you choose. I show you how to install and configure it on DigitalOcean.

00:00 Installation on DigitalOcean
03:30 Running Llama2 on a Server
05:43 Calling a Model Remotely
12:26 Conclusion
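
For reference, the steps in the video boil down to roughly the following. This is a sketch assuming the official install script (check the Ollama site for the current URL) and the default API port 11434:

# Install Ollama; on Linux the script also registers a systemd service
curl -fsSL https://ollama.com/install.sh | sh

# Pull and chat with Llama 2 interactively
ollama run llama2

# Or call the model through the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'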

#llm #machinelearning

Support My Work:

Gear I use:

As an affiliate I earn on qualifying purchases at no extra cost to you.
Comments

Thanks for leaving all the errors in and correcting them. Excellent.

crazytom

Just what I was looking for, thanks Ian!

DataDrivenDailies

This is amazing news! I'm limited to 16 GB of RAM on my Macs, but not so on my Linux machines!

sto

I was using Ubuntu Desktop running Mixtral on Ollama so I could make API calls from my FastAPI app in VS Code, but realized I should separate them out and go headless for Ollama. I didn't realize that CORS was preventing outside calls from my dev machine, and this video helped once I found the GitHub page as well. Thanks for sharing.

datpspguy
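
For anyone hitting the same remote-access / CORS problem: assuming the systemd install, Ollama reads OLLAMA_HOST and OLLAMA_ORIGINS from the service environment, so something along these lines is a reasonable starting point (tighten the origins to your dev machine rather than the wildcard shown here):

# Open an override file for the ollama service
sudo systemctl edit ollama
# In the editor that appears, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
#   Environment="OLLAMA_ORIGINS=*"
# Then apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama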

I can't run it with service ollama start; it says the following:
$ sudo service ollama start
ollama: unrecognized service

trapez_yt
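
On the "unrecognized service" error above: the official Linux installer sets up a systemd unit, so systemctl is the command to use rather than the older service wrapper (if you only downloaded the binary by hand, there is no unit to start and you would run ollama serve yourself):

# Manage the unit created by the install script
sudo systemctl start ollama
sudo systemctl status ollama
# Optionally start it on every boot
sudo systemctl enable ollama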

Mistral 7B is running really sweet on my old Asus (16 GB RAM) laptop.

timjx

This was a really helpful video, Ian!
But I am facing one issue: after running ollama serve, the server shuts down when I close the terminal. Please tell me if there is a way to prevent this.

Thanks!

rishavbharti
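
On the "shuts down when I close the terminal" issue: running ollama serve in an interactive shell ties it to that session, so it dies with the terminal. Two common workarounds, assuming the standard install (the systemd route is the cleaner one); the log path is just an example:

# Option 1: let systemd run it in the background and across reboots
sudo systemctl enable --now ollama

# Option 2: detach a manual ollama serve from the terminal
nohup ollama serve > "$HOME/ollama.log" 2>&1 &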

Can we use Ollama to serve in production? If not, what is your suggestion?

PengfeiXue

Hello. I'm developing an on-premises application that consumes Ollama via its API. However, after a few minutes, the Ollama server stops automatically. I would like to know if there is any way to keep it running until I stop it.
Thank you very much.

BileGamer
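
On the "stops after a few minutes" behaviour: by default Ollama unloads an idle model after a few minutes, which can look like the server stopping. Recent Ollama versions expose a keep-alive setting; a sketch, assuming a systemd install and a reasonably new build:

# Keep models loaded indefinitely via the service environment
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_KEEP_ALIVE=-1"
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Or set it per request with the keep_alive field
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "hello",
  "keep_alive": -1
}'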

For a 70B model, what server would I need to rent? The docs say at least 64 GB of RAM... but there are no minimum specs for the NVIDIA card in the docs. Who has experience with this?

ITworld-gwiy

RunPod is very affordable too, from 17 cents per hour for an NVIDIA 3080.

Gee

How does this scale for multiple users sending multiple requests at a time? Do you need to use a load balancer / reverse proxy? I don't think Ollama supports batch inference yet.

atrocitus
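
On the scaling question: Ollama itself does not load-balance across machines, so a common pattern is to run several instances and put a reverse proxy in front of them. A minimal nginx sketch, written here as a shell heredoc; the backend addresses and config path are placeholders:

sudo tee /etc/nginx/conf.d/ollama-lb.conf > /dev/null <<'EOF'
upstream ollama_backends {
    # Placeholder backends, each running its own Ollama instance
    server 10.0.0.11:11434;
    server 10.0.0.12:11434;
}

server {
    listen 80;

    location / {
        proxy_pass http://ollama_backends;
        proxy_set_header Host $host;
        proxy_read_timeout 600s;   # generations can take a while
    }
}
EOF
sudo nginx -t && sudo systemctl reload nginx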

Which version of Ubuntu did you choose? It seems to be missing from the video.

JordanCassady

Has anyone got this running on anything lower than 8 GB of RAM on DigitalOcean? I tried locally on my own computer with a huge prompt and a 3B model, and it only used around 1 GB of RAM at most.

jamiecropley

I got an error while executing the curl command: Failure writing output to destination.

peteprive
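
On the "Failure writing output to destination" error: that is curl exit code 23, meaning curl could not write the response where it was told to (an unwritable -o path, a full disk, or a broken pipe are the usual causes). A quick sanity check, assuming the generate endpoint from the video:

# Write the response to a location you definitely own
curl http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "hello", "stream": false}' \
  -o "$HOME/ollama_response.json"
cat "$HOME/ollama_response.json"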

Hello Ian, it's a really great video. I have a query and would be very thankful if you can help me; I have been stuck for 3 days. I am trying to host Ollama on my server. I am very new to Linux and don't understand what I am doing wrong. I am using nginx to proxy Ollama and have configured the nginx file, yet I'm getting an access denied error. I can show you the config if you want, please respond.

AdarshSingh-rmer
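
For the nginx question above: a bare-bones reverse proxy for Ollama usually looks something like the sketch below (server_name and the config path are placeholders). If nginx connects but Ollama itself answers 403, its origin allow-list may also need widening via OLLAMA_ORIGINS, as in the earlier snippet.

sudo tee /etc/nginx/conf.d/ollama-proxy.conf > /dev/null <<'EOF'
server {
    listen 80;
    server_name ollama.example.com;   # placeholder domain

    location / {
        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 600s;
    }
}
EOF
sudo nginx -t && sudo systemctl reload nginx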

How do you connect to the server via a Python client or FastAPI for integration with projects/notebooks?

SuperRia
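
On connecting from Python or FastAPI: any HTTP client (requests, httpx, or the official ollama Python package) simply POSTs JSON to the same endpoints the video calls with curl, so the request below shows the shape your client code would send. The chat endpoint is available in recent Ollama versions; the server address is a placeholder:

curl http://YOUR_SERVER_IP:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {"role": "user", "content": "Summarize why the sky is blue"}
  ],
  "stream": false
}'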

0:08 How did you get to your pronunciation of Linux?
10:53 How could one correct the error occurring here?

VulcanOnWheels

How come the model runs in 8 GB of RAM? The docs themselves say it needs at least 16 GB for Llama 2.

sugihwarascom

Do you think it is safe to install it on your own laptop instead of the cloud server?

wryltxw