Deploying machine learning models on Kubernetes

In this video, we will go through a simple end-to-end example of how to deploy an ML model on Kubernetes. We will use a pretrained Transformer model for masked language modelling (fill-mask) and turn it into a REST API. Then we will containerize our service and finally deploy it on a Kubernetes cluster.
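As a rough sketch of the API step (the file name, model name, and endpoint here are assumptions, not the exact code from the video), a FastAPI service wrapping a fill-mask pipeline might look like this:

```python
def top_predictions(raw, k=3):
    """Keep the k highest-scoring candidates from a fill-mask pipeline
    output (a list of dicts with "token_str" and "score" keys)."""
    ranked = sorted(raw, key=lambda r: r["score"], reverse=True)
    return [{"token": r["token_str"], "score": round(r["score"], 4)}
            for r in ranked[:k]]

def build_app():
    # Imports are deferred so the helper above works without the ML stack installed.
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import pipeline

    app = FastAPI()
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")  # assumed model

    class PredictRequest(BaseModel):
        text: str  # must contain the [MASK] token

    @app.post("/predict")
    def predict(req: PredictRequest):
        return top_predictions(fill_mask(req.text))

    return app
```

If this lives in a file called `main.py` with `app = build_app()` at module level, it can be served locally with `uvicorn main:app`.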

Code from the video:

00:00 Intro
00:22 3 step procedure diagram
01:42 Existing framework overview
02:09 Creating an API
09:25 Containerization
13:53 Containerization - custom platform
15:47 Preparing a minikube K8s cluster
17:43 K8s: Deployment and service
21:31 K8s: 2 cool features - self-healing and load balancing
26:00 Outro
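For the containerization step (09:25), a minimal Dockerfile could look like the following; the base image, file names, and port are assumptions:

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```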
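For the Deployment and Service step (17:43), a sketch of the two manifests might look like this; the resource names, labels, and image tag are assumptions. Running three replicas is what enables the two features shown at 21:31: Kubernetes restarts crashed pods (self-healing) and the Service spreads traffic across the replicas (load balancing).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fill-mask-api
spec:
  replicas: 3                # failed pods are recreated automatically
  selector:
    matchLabels:
      app: fill-mask-api
  template:
    metadata:
      labels:
        app: fill-mask-api
    spec:
      containers:
        - name: fill-mask-api
          image: fill-mask-api:latest   # assumed image name
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: fill-mask-api
spec:
  selector:
    app: fill-mask-api
  ports:
    - port: 80
      targetPort: 8000
```

Applied with `kubectl apply -f <file>`, the Service then load-balances requests across the three pod replicas.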

Credits logo animation
Comments

Always a pleasure to watch someone as talented as you! Keep it up :)

ludwigstumpp

Great example. Thanks for the information

mmacasual-

Really helpful foundation for MLOps.

shivendrasingh

You're great. Thanks for sharing this in such a nice way.

JoseMiguel_____

Great video, thanks a lot! Really liked the explanation!

aditya_

Would appreciate a video using VS Code that covers the Docker container files, the K8s files, and FastAPI.

dsshujo

Really nice video. Would you see any benefit in running this deployment on a single node with an M1 chip? I'd say somewhat yes, because a single inference might not take all of the M1's CPU, but what about scaling the model in terms of RAM? One of those models might take 4-7 GB of RAM, which adds up to 21 GB of RAM for just 3 pods. What's your opinion on that?

davidpratr

Amazing video. At min 5:25, how did you open the second bash pane in the console? I searched for a long time and couldn't find anything. Thanks and regards!

unaibox

Hi, I would like to use a GPU to accelerate this demo. Can you give me some tips? Thank you.

zhijunchen

Looking forward to seeing your face a lot :))

kwang-jebaeg

I am having a problem at min 18:00: the model load keeps getting killed. I tried `minikube config set memory 4096` but still have the same problem. Any idea? I've been looking for a solution for 3 hours and can't find one.

unaibox

What terminal application is this, with the different panes?

alivecoding

The reason you got ".", "?" as the output for [MASK] is that you didn't end your input request with a full stop. BERT masking models should be prompted that way: "my name is [MASK]." should have been your request.

SunilSamson-wl