Kubernetes Tutorial: Why Do You Need StatefulSets in Kubernetes?

Before we talk about stateful sets, we must first understand why we need them. Why can’t we just live with Deployments?

Let’s start from the very basics. For a minute, let’s set aside everything that we have learned so far, such as Deployments, Kubernetes, Docker, containers, and virtual machines.

Let’s just start with a simple server. Our good old physical server. And we are tasked to deploy a database server. So we install and set up MySQL on the server and create a database. Our database is now operational. Other applications can now write data to our database.

To withstand failures, we are tasked to deploy a high-availability solution. So we deploy additional servers and install MySQL on those as well. We have blank databases on the new servers.

So how do we replicate the data from the original database to the new databases on the new servers?

There are different replication topologies available.
The most straightforward one is a single-master, multi-slave topology, where all writes go to the master server and reads can be served by either the master or any of the slave servers.

So the master server should be set up first, before deploying the slaves.

Once the slaves are deployed, perform an initial clone of the database from the master server to the first slave. After the initial copy, enable continuous replication from the master to that slave so that the database on the slave node is always in sync with the database on the master.

Note that both of these slaves are configured with the address of the master host. When replication is initialized, you point the slave to the master using the master’s hostname or address. That way the slaves know where the master is.

Let us now go back to the world of Kubernetes and containers and try to deploy this setup.

In the Kubernetes world, each of these instances, including the master and the slaves, is a POD that is part of a Deployment.

In step 1, we want the master to come up first, and then the slaves. In the case of the slaves, we want slave 1 to come up first and perform the initial clone of data from the master. Then, in step 4, we want slave 2 to come up next and clone data from slave 1.

With Deployments you can’t guarantee that order. All pods that are part of a Deployment come up at the same time.

So the first step can’t be implemented with a Deployment.

As we have seen while working with Deployments, the pods come up with random names, so that won’t help us here. Even if we decide to designate one of these pods as the master and use its name to configure the slaves, if that POD crashes and the Deployment creates a new pod in its place, it is going to come up with a completely new pod name. The slaves are now pointing to an address that does not exist. Because of all of this, the remaining steps can’t be executed.
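For comparison, here is a minimal sketch of what such a Deployment might look like. The name mysql, the image, the env settings, and the generated pod name shown in the comment are assumptions for illustration only:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          env:
            - name: MYSQL_ALLOW_EMPTY_PASSWORD   # for illustration only; use a real secret in practice
              value: "yes"
# All three pods are created at the same time, with generated names such as
# mysql-7d9c6b8f4d-x2kqp, and the random suffix changes whenever a pod is replaced.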

And that’s where stateful sets come into play. Stateful sets are similar to Deployments, in that they create PODs based on a template, but with some differences. With stateful sets, pods are created in a sequential order. After the first pod is deployed, it must be in a running and ready state before the next pod is deployed.

So that helps us ensure the master is deployed first, then slave 1, and then slave 2. That takes care of steps 1 and 4.

Stateful sets assign a unique ordinal index to each POD: a number that starts at 0 for the first pod and increments by 1 for each pod after it.

Each pod gets a name derived from this index combined with the stateful set name. So the first pod gets mysql-0, the second pod mysql-1, and the third mysql-2. So no more random names. You can rely on these names going forward. We can designate the pod with the name mysql-0 as the master, and any other pods as slaves. Pod mysql-2 knows that it has to perform an initial clone of data from the pod mysql-1. If you scale up by deploying another pod, mysql-3, then it would know that it can perform a clone from mysql-2.
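As a rough sketch, a StatefulSet for this setup could look like the following. The names, replica count, image, and env settings are assumptions for illustration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql   # headless Service that gives the pods their network identity
  replicas: 3          # pods are created one at a time: mysql-0, then mysql-1, then mysql-2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          env:
            - name: MYSQL_ALLOW_EMPTY_PASSWORD   # for illustration only; use a real secret in practice
              value: "yes"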

To enable continuous replication, you can now point the slaves to the master at mysql-0.
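Reaching mysql-0 at a stable address relies on a headless Service, which is covered later in the series. A minimal sketch, assuming the Service and the StatefulSet are both named mysql:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None   # headless: no virtual IP, DNS resolves directly to the pods
  selector:
    app: mysql
  ports:
    - port: 3306
# With this Service in place, the slaves can point at the master using the stable
# DNS name mysql-0.mysql (or mysql-0.mysql.<namespace>.svc.cluster.local).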

Even if the master fails and the pod is recreated, it would still come up with the same name. Stateful sets maintain a sticky identity for each of their pods. And that helps with the remaining steps. The master is now always the master and is available at the address mysql-0.

And that is why you need stateful sets. In the upcoming lectures, we will talk more about creating stateful sets, headless services, persistent volumes, and more.

#KubernetesTutorial #Kubernetes
Comments

nice explanation - but small request, please use night-mode for the video as most developers prefer night mode :)

sk

Unfortunately I have only one like button. Thanks for the crystal clear explanation.

neverforgetsamit

Excellently explained, Mumshad. You're a gem 💎

soumyadipchatterjee

Hi Mumshad,
Thanks for the great session. As you said, in a StatefulSet, if any POD goes down it comes back up with the same name, i.e. mysql-0 in your example. How does this newly constructed pod sync the data? It might have lost all its data when it went down, and even if it uses persistent storage, it might have missed some new data during the period when it was down?

ashwani

Mumshad you are cool. Nice set of videos. Thanks for making it so simple and easy to understand.

BipinJethwani

Good explanation. Question: how does K8s preserve statefulness if a stateful pod crashes and is replaced by another pod?

samitjain

Very good explanation... Thanks Mumshad...

shantanuparanjpe

Hi Mumshad.
Your explanation is awesome..

I have one doubt here about how pods get dynamically assigned an IP address.
Which component in K8s allocates the IPs to PODs?
Kindly answer.

jashvasabbu

Sir, it is understandable, but a practical demo would give me a much better understanding.

devareddy

This is part of which course, CKA? Or which course in KK?

deva_

Your stateful set has pods with different capabilities. The master (mysql-0) in your example is writable, while the others aren't.
Since you have those differences, why not use two charts: mysql-master with a replica count of 1, and mysql-replica with a replica count that can scale? For the replicas connecting to the master, use the service name instead of the hostname?

DontScareTheFish

Hi Sir - I would appreciate it if you could provide another simple example if possible. I think this is quite a complicated one.

anilkommalapati

Hello Sir, nice video. Please share any videos on getting started with Helm charts on Kubernetes clusters. Thanks...

gangadharrao

Need your help. I have created an "nfs server pod" instead of a local NFS installation, and then created the PV. In the PersistentVolume, I have mentioned the NFS server's service name, like:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server:
    path: "/

NOW I want to bind this PersistentVolume in the StatefulSet volumeClaimTemplate in GCP. How do we mention it?
I gave it like this in the StatefulSet:

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

The problem is that if I give the above example, a new volume is created, not bound to my PV... I hope you understand what I am trying to say.. else let me know. Thanks in advance

fz-ssaravanan

Thank you, I love you. May Allah bless India.

reardeltoit

Why master and slaves, and not "primary pods" and "secondary pods"? Is this a cultural bias?
Is it ethical to do so?

aboubacaralaindioubate