What is LOAD BALANCING? ⚖️

Load balancing is a key concept in system design. One simple approach is to hash each request and send it to the server the hash assigns.

The standard way to hash objects is to map them into a search space and then send the load to the mapped server. A system using this policy is likely to suffer when nodes are added to or removed from it, because most requests get remapped.
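A quick way to see the problem (a minimal sketch; the request IDs, server counts, and helper names are illustrative, not from the video):

```python
import hashlib

def server_for(request_id: str, n_servers: int) -> int:
    # Use a stable hash: Python's built-in hash() is salted per process.
    digest = hashlib.sha256(request_id.encode()).hexdigest()
    return int(digest, 16) % n_servers

requests = [f"req-{i}" for i in range(10_000)]
before = [server_for(r, 4) for r in requests]
after = [server_for(r, 5) for r in requests]
moved = sum(b != a for b, a in zip(before, after)) / len(requests)
print(f"{moved:.0%} of requests changed servers")  # ~80% when going 4 -> 5
```

Changing the modulus from 4 to 5 remaps roughly 80% of requests, which is exactly the cache-invalidation problem the video discusses.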

Some terms you will hear in system design interviews: fault tolerance, the ability to keep serving requests when a machine crashes; scalability, the ability to add machines to process more requests; and request allocation, which means assigning a request to a server.

Load balancing is often tied to service discovery and global locks. The type of load we want to balance here comes from sticky sessions, where repeated requests from the same user should land on the same server.

Looking to ace your next interview? Try this System Design video course! 🔥

00:00 Load Balancing - Consistent Hashing
00:33 Example
01:29 Server-Client Terms
02:12 Scaling
02:40 Load Balancing Problem
03:58 Hashing Requests
06:37 Request Stickiness
08:00 Splitting the Pie
10:35 Request Stickiness
13:29 Next Video!

With video lectures, architecture diagrams, capacity planning, API contracts, and evaluation tests, it's a complete package.

#LoadBalancer #Proxy #SystemDesign
Comments

For anyone confused by the pie chart, the explanation he gives makes sense only when you watch the whole video. In a nutshell, when you start with 4 servers, each server handles 25% of the users. The hashing function takes the user's user ID or some other information that encapsulates the user data (and is consistent), so any time you want to, for example, fetch a user profile, you do it via the same server over and over again, since the user ID never changes (therefore the hash of the user ID never changes and will always point to the same server). The server remembers that and creates a local cache of that user's information, so that it doesn't have to execute the (expensive) action of calculating the user profile data, but instead just fetches it from the local cache quickly. Once your userbase becomes big enough and you require more processing power, you will have to add more servers. Once you add more servers to the mix, the user distribution among servers will change. Like in the example from the video, he added one server (from 4 servers to 5 servers). Each server now needs to handle 20% of the users. So here is where the explanation for the pie chart comes from.

Since the first server s0 handles 25% of the users, you need to take 5% of them and assign it to the second server s1. The first server s0 no longer serves the 5% of the users it used to, so the local cache for those users becomes invalidated (i.e., useless, so we need to fetch that information again and re-cache it on a different server that is now responsible for those users). The second server s1 now handles 25%+5%=30% of the traffic, but it needs to handle only 20%. We take 10% of its users and assign them to the third server s2. Again, like before, the second server s1 lost 10% of its users, and with it the local cache for those users' information becomes useless. Those 10% of users become the third server's users, so the third server s2 handles 25%+10%=35% of the traffic. We take 15% from the third server (remember, it needs to handle only 20%) and give it to the fourth server s3. The fourth server now handles 25%+15%=40% of the traffic. Like before, the fourth server lost 20% of its users (if we're unlucky and careless with the re-assignment of numbers, it lost ALL of its previous users and got other servers' users instead), and therefore those 20% of users' local cache becomes useless, adding to the workload of other servers. Since the fourth server handles 40% of the traffic, we take 20% of its users and give them to the new fifth server s4. Now all servers handle users uniformly, but the way we assigned those users is inefficient. To remedy that, we need to look at how to perform our hashing and mapping of users better when expanding the system.
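The cascade above can be verified with a tiny simulation of the naive contiguous-range scheme (a sketch; the 100-slot pie and helper names are illustrative):

```python
SLOTS = 100  # one slot per percentage point of traffic

def owner(slot: int, n_servers: int) -> int:
    # Contiguous equal ranges: server 0 owns the first SLOTS/n slots, etc.
    return slot // (SLOTS // n_servers)

moved_to: dict[int, int] = {}
for slot in range(SLOTS):
    old, new = owner(slot, 4), owner(slot, 5)
    if old != new:
        moved_to[new] = moved_to.get(new, 0) + 1

print(moved_to)                                   # {1: 5, 2: 10, 3: 15, 4: 20}
print(sum(moved_to.values()), "of 100 slots change owners")  # 50
```

This reproduces the 5/10/15/20 handoff chain: half the pie changes owners, and every moved slot means an invalidated cache entry.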

bithon

I'm a UX designer, so it's irrelevant for me to know this, but I'm just watching your video and sitting here willing to complete the whole series. That's how brilliantly you explain, chap.

Karthikdravid

If anyone is confused by the pie diagram,
We need to reduce the distribution from 25 each to 20 each. So we take 5 from the first server and merge it with the second one. Then we take 10 (the 5 that came from the first one and 5 from the second one) and merge it with the third. So now, both one and two have 20 each. Then we go on, taking 15 from the third and merging it with the fourth, and finally taking 20 from the fourth to create the fifth server's space.

Please correct me if I'm wrong. This is just a simple breakdown which I think is what he intended

nxpy

Notes to self:
* Load balancing distributes requests across servers.
* You can use `hash(r_id) % n_servers` to get the server index for a request `r_id`.
-> Drawback: if you add an extra server, `n_servers` changes and `r_id` will likely end up on a different server. This is bad because we often want to map requests with the same IDs consistently to the same servers (there could e.g. be cached data there that we want to reuse).
* "Consistent hashing" hashes with a constant denominator `M`, e.g. `hash(r_id) % M`, and then maps the resulting integer onto a server index. Each server owns a range of integers that map to its index.
* The pie example demonstrates that if an extra server is added, the hashing function stays the same, and one can then change the range-to-server-index mapping slightly so that it remains likely that an `r_id` gets mapped to the same server as before the server addition (see the sketch below).
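A minimal consistent-hashing sketch along these lines, assuming M = 2^32 and a single point per server (real systems typically place many virtual nodes per server; the names here are illustrative):

```python
import bisect
import hashlib

M = 2**32  # fixed search-space size; never changes when servers are added

def h(key: str) -> int:
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % M

class Ring:
    def __init__(self, servers: list[str]):
        # Each server owns the arc of the ring ending at its point.
        self.points = sorted((h(s), s) for s in servers)

    def server_for(self, r_id: str) -> str:
        keys = [p for p, _ in self.points]
        # First server point clockwise from the request's hash (wraps around).
        i = bisect.bisect_right(keys, h(r_id)) % len(self.points)
        return self.points[i][1]

ring = Ring(["s0", "s1", "s2", "s3"])
print(ring.server_for("user-42"))  # adding "s4" later only re-maps one arc
```

Because `M` is fixed, adding a server only moves the requests that fall in the arc the new server takes over; everything else keeps its old mapping.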

UlfAslak

I love how you've taken the time and effort to teach complex topics in a simple manner with real world examples. You also stress on words that are important and make analogies. This helps us students remember these topics for life! Thank you and really appreciate the effort!

manalitanna

First of all, this is a perfect lesson, and I have absorbed 100% of it as a school student. The pie chart was a little confusing at first, because with 4 servers it's like 25 buckets each, and then when you add 1 server it's pretty much 20 buckets each. So dividing the pie into 5 and marking each slice as 20 buckets is the easiest way to see it.

proosdisanayaka

This channel is a total gem. I don't think I've seen anything similar on YouTube in regard to quality. Really appreciate it!

valiok

That's an eye-opener. I've been working in the industry for a few years now but never realised how small changes like this can affect the system. Thank you so much for the content!

ruchiragarwal

Your way of explanation with real-life examples is really effective. I can visualize everything and remember it easily. Thanks for this.

sadiqraza

Notification Squad :-)
The problem with having awesome teachers like Gaurav Sir is that you want the same ones to teach you in college too :-) Thanks Gaurav sir for the System Design series.

Codearchery

Seems like a 4th-year senior teaching the juniors... your videos are beast :)

sumitdey

Hey man, cool explanation for all the advanced system design learners... nice! keep it coming!!

KarthikaRaghavan

You seem very passionate about the subject.
It makes it 10x better to learn that way.
Thank you.

sagivalia

This was really insightful. I thought load balancing was simple, but the bit about not losing your cached data was something I didn't know before.

aryanrahman

This method is OK when accessing a cache, and the problems that arise are somewhat mitigated by consistent hashing.

However, there are two things I want to point out:
1. Caching is typically done using a distributed cache like Memcached or Redis, and the instances should not cache too much information.
2. If you want to pin requests with a particular request ID to one instance, you should configure your load balancer to use sticky sessions (sketched below). The mapping between the request ID and the EC2 instance can be stored in a git repo, or cookies can be used, etc.
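For point 2, a minimal sketch of what stickiness boils down to (the backend addresses, session IDs, and helper names are hypothetical; a real load balancer would persist this mapping via a session cookie or a shared store rather than a local dict):

```python
import random

backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
session_to_backend: dict[str, str] = {}  # in practice: a cookie or shared store

def route(session_id: str) -> str:
    # The first request of a session picks a backend; later requests with the
    # same session ID reuse the stored mapping, so its cache stays warm.
    if session_id not in session_to_backend:
        session_to_backend[session_id] = random.choice(backends)
    return session_to_backend[session_id]

assert route("sess-abc") == route("sess-abc")  # same session, same backend
```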

mukundsridhar

You give great explanations!! And the little video editing touches make the video so interesting. Going to watch all the videos you've uploaded.

vanshpathak

Man, thanks! You made this easy to 100% understand. Your teaching style is excellent!

ryancorcoran

Hey Gaurav, in the first part of the video you mention that by taking the mod operation you distribute the requests uniformly. What kind of assumptions do you make (and why) about the hash function which ensure that the requests are uniformly distributed? I could come up with a hash function which would send all the requests to, say, server 1.
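The commenter's point is right: uniformity is an assumption about the hash function itself, usually justified by choosing one whose outputs are effectively uniform over its range. A small demonstration (a sketch; a cryptographic hash stands in for "a good hash", and the names are illustrative):

```python
import hashlib
from collections import Counter

def good_hash(key: str) -> int:
    # Cryptographic hashes spread keys approximately uniformly.
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)

def bad_hash(key: str) -> int:
    return 4  # degenerate: constant output, so every request lands together

keys = [f"req-{i}" for i in range(100_000)]
print(Counter(good_hash(k) % 4 for k in keys))  # roughly 25,000 per server
print(Counter(bad_hash(k) % 4 for k in keys))   # all 100,000 on server 0
```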

vinayakchuni

I subscribed to/bookmarked your channel, I don't know when. I knew that I'd need it at some point in time, and that time is now. Thank you for the series... ❤️❤️

Varun-msiv

Hey Gaurav,

You have a knack for explaining things in a very simple manner (ELI5).

There is one part of this discussion which I feel conveys some incorrect information (or I might have understood it incorrectly). You mention that 100% of requests will be impacted on addition of a new server. However, I believe that only 50% of the requests should be impacted (server 1 retains 20% of its original requests, server 2 15%, server 3 10%, and server 4 5%).

In fact, it's always exactly 50% of the requests that are impacted on addition of 1 new server irrespective of the number of original servers. This turned out to be a pretty fun math problem to solve (boils down to a simple arithmetic progression problem at the end).

The reason your calculation results in a value of 100% is double counting: each request is accounted for twice, once when it is removed from the original server, and then again when it is added to the new server.
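A quick check of the always-50% claim (a sketch; it follows the arithmetic progression described above, with server counts chosen for illustration):

```python
from fractions import Fraction

# Moving from n to n+1 servers, server i hands i/(n*(n+1)) of all traffic
# to server i+1; the total moved is (1 + 2 + ... + n) / (n*(n+1)) = 1/2.
for n in range(2, 10):
    moved = sum(Fraction(i, n * (n + 1)) for i in range(1, n + 1))
    assert moved == Fraction(1, 2)
    print(f"{n} -> {n + 1} servers: {float(moved):.0%} of requests move")
```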

ShubhamShamanyu