How to avoid a single point of failure in distributed systems ✅


A single point of failure (SPOF) in computing is a component whose failure can take down the entire system. A lot of time and resources are spent on removing single points of failure from an architecture or design.

Single points of failure often appear when setting up coordinators and proxies. These services distribute load and discover other services as they join and leave the system. Because they perform critical, centralized tasks, they are more likely to become SPOFs.

One way to mitigate the problem is to run multiple instances of every component in the service. The dependency graph then becomes more flexible: when one instance fails, the system can switch to another instance instead of failing requests.
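As a rough illustration of switching to another instance, here is a minimal sketch of failover across redundant instances. The function names (`call_with_failover`, `broken_instance`, `healthy_instance`) and the use of `ConnectionError` to signal a dead instance are assumptions made for this example, not something taken from the video.

```python
from typing import Callable, List, Sequence


class AllInstancesFailed(Exception):
    """Raised when every redundant instance failed to handle the request."""


def call_with_failover(instances: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each instance in order and return the first successful response."""
    errors: List[Exception] = []
    for instance in instances:
        try:
            return instance(request)
        except ConnectionError as exc:
            # This instance is unreachable; remember why and try the next one.
            errors.append(exc)
    raise AllInstancesFailed(f"{len(errors)} instance(s) failed")


def broken_instance(request: str) -> str:
    # Hypothetical instance that is always down.
    raise ConnectionError("instance is down")


def healthy_instance(request: str) -> str:
    # Hypothetical instance that always answers.
    return f"handled: {request}"


if __name__ == "__main__":
    print(call_with_failover([broken_instance, healthy_instance], "GET /"))
```

With only one entry in `instances`, that entry is the single point of failure; with two or more, a request fails only if all of them are down at the same time.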

Another approach is to keep backups that allow a quick switchover on failure. Backups are especially useful for components that hold data, such as databases.
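The switchover idea can be sketched with a tiny in-memory store that copies every write to a backup node and promotes the backup when the primary is down. The `Node` and `PrimaryBackupStore` classes, the `alive` flag, and the synchronous copy are all assumptions for illustration; a real database handles failure detection and replication in far more detail.

```python
class Node:
    """A hypothetical storage node with an in-memory dictionary."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}
        self.alive = True


class PrimaryBackupStore:
    """Writes go to the primary and are copied to the backup."""

    def __init__(self, primary: Node, backup: Node) -> None:
        self.primary = primary
        self.backup = backup

    def _ensure_primary(self) -> None:
        # Promote the backup if the current primary is down.
        if not self.primary.alive:
            if not self.backup.alive:
                raise RuntimeError("both nodes are down")
            self.primary, self.backup = self.backup, self.primary

    def put(self, key: str, value: str) -> None:
        self._ensure_primary()
        self.primary.data[key] = value
        if self.backup.alive:
            # Synchronous copy so the backup can take over without data loss.
            self.backup.data[key] = value

    def get(self, key: str) -> str:
        self._ensure_primary()
        return self.primary.data[key]


if __name__ == "__main__":
    store = PrimaryBackupStore(Node("db-1"), Node("db-2"))
    store.put("user:1", "alice")
    store.primary.alive = False  # simulate a primary failure
    print(store.get("user:1"), "served by", store.primary.name)
```

Note that in this sketch a node that was down misses the writes made in the meantime, so it would need to be resynchronized before it could safely act as a backup again.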

Allocating more resources, distributing the system, and replicating data are some ways of mitigating SPOFs. This is why designs often include horizontal scaling and partitioning.

Note that the CAP theorem limits how far this can go: during a network partition, a system that requires strict consistency has to give up some availability, so redundancy alone cannot guarantee that every request succeeds.
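To make partitioning and replication a little more concrete, the sketch below places each key on several nodes so that no single node holds the only copy. The function name `replicas_for`, the node names, and the default replication factor of 2 are assumptions for this example. It uses plain hash-modulo placement, not consistent hashing, so adding or removing a node would move many keys.

```python
import hashlib
from typing import List, Sequence


def replicas_for(key: str, nodes: Sequence[str], replication_factor: int = 2) -> List[str]:
    """Return the nodes that should hold a copy of the given key."""
    if not nodes:
        raise ValueError("at least one node is required")
    # A stable hash, so every client computes the same placement for a key.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replication_factor, len(nodes))
    # Take consecutive nodes (wrapping around) so the replicas are distinct.
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    for key in ["user:1", "user:2", "order:17"]:
        print(key, "->", replicas_for(key, nodes))
```

Each key lands on two different nodes here, so losing any one node leaves at least one copy of every key reachable.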



Comments

The Netflix example is a good one. I saw their PyCon 2018 talk, where they showed how they do regional failovers in under 7 minutes. It was a good talk.

WittyGeek

Your positive energy makes me feel good! I feel like even I can get through an interview after watching you! Excellent!

fchas

Elon Musk has already seen the Earth as a single point of failure and has been trying to create a slave on Mars. In that case, maybe the Moon will be a load balancer.

semihkekul

A few more observations:

All your examples use reverse proxies to achieve HA (except the browser, though it is not explicitly mentioned), but there are other techniques.

1. Client-side load balancing: using a service registry (a somewhat smarter DNS) and smart clients to achieve HA. In your example, the browser can be considered a smart client, but there is a lot more of this in server-to-server communication to achieve HA in request-response flows.
2. Also, your LB in a single zone is usually kept in an HA configuration using something like keepalived and a floating IP.

songsenjoy

Prof. Gaurav Sen,

Learning a lot, Sir. Thank you.

kumarakantirava

Very informative! Please keep making more videos!

manojmj

Hello Gaurav,
Thanks a million for sharing your knowledge and helping us.
Keeping your examples and explanations as simple as possible makes you stand out.
Please do add small topics like this. They are definitely useful, and otherwise they go unnoticed inside a larger video or topic.

karandutt

Hi Gaurav, thanks for the videos. I am really enjoying them and learning a lot. I have a question. As you mentioned in the video, when a load balancer fails we overcome the problem by running multiple load balancers and keeping all of their IPs in the DNS. But how does the DNS know whether the first load balancer is working or not? DNS is just a name-to-address resolver, and once the name is resolved it is out of the picture. Where do we write the logic that says "if loadbalancer1 fails, contact loadbalancer2", or something like this?

venkatreddy

Sir,
It's wonderful of you to pin that question with your insightful answer. I was struggling to understand how clusters can offer high availability for websites. Your DNS answer enlightened me on a lot of design. Prostrating at your feet.

kumarakantirava

Informative video, especially the meteorite scene. Awesome!!

sumitlahiri

Man, you literally helped me finish my assignment! Learned a lot. Great content. Thanks!

sachinakinapally

Rephrase "more nodes" as "redundant nodes" when addressing single points of failure.

ravindrababu

3:35 Gaurav - we don't have to worry about the Domain Name System (DNS) being an articulation point / SPOF (single point of failure), since DNS is already a decentralized distributed system, correct? In a sense, we are already taking advantage of an existing scalable and resilient network architecture, correct?

harisridhar

Hey Gaurav, a big thanks for the effort you are putting in. I learnt many things and cleared up what were myths to me until now. You have explained the concepts in very simple terms. Keep up the good work.

abhikeshu

Thank you, Gaurav. We need to know these small details about each part.
Your videos are amazing. Keep making videos on large systems, and whenever you come up with a subtopic, you can link it in the description so that one can master that topic before moving ahead.
Thanks a lot

tusharverma

Bro, all the very best in your new role at Uber, wishing you all success

influencer

Nice bro, I had to use this in recent development and I understood the concept... Thanks:)

suraj-gdqy

Chaos engineering is applied to an application or node before it goes to the production phase: triggering controlled attacks while keeping the ability to roll back the attack and return to the original stable state.

saitejajonnadula

p*p usually does not hold when you have a hot-data issue that simply migrates to the backup/replica and takes that down as well, because the system was not redesigned in time for the scale it now has to support. In that case the particular technology's ability to handle load becomes a single point of failure.

GeorgeChi

Very well explained, and great positive energy with a great smile :) Thanks, buddy

prashantsrivastava
