How to avoid a single point of failure in distributed systems ✅

A single point of failure (SPOF) in computing is a critical component whose failure can take down the entire system. A lot of time and resources are spent on removing single points of failure from an architecture or design.

Single points of failure often pop up when setting up coordinators and proxies. These services help distribute load and discover services as they join and leave the system. Because these tasks are both critical and centralized, the services that perform them are especially prone to becoming SPOFs.
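To make the discovery role concrete, here is a minimal sketch of a heartbeat-based registry. The class name `ServiceRegistry`, its methods, and the `ttl_seconds` parameter are hypothetical names chosen for illustration, not taken from any real library. Instances announce themselves periodically, and any instance that stops sending heartbeats drops out of the live list.

```python
from typing import Dict, List


class ServiceRegistry:
    """Toy in-memory registry (hypothetical): tracks when each instance last checked in."""

    def __init__(self, ttl_seconds: float = 10.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._last_heartbeat: Dict[str, float] = {}

    def heartbeat(self, address: str, now: float) -> None:
        # An instance joins (or stays in) the system by reporting in.
        self._last_heartbeat[address] = now

    def live_instances(self, now: float) -> List[str]:
        # Instances whose last heartbeat is older than the TTL are treated as gone.
        return sorted(
            address
            for address, seen in self._last_heartbeat.items()
            if now - seen <= self.ttl_seconds
        )
```

The caller passes the clock value in explicitly, which keeps the sketch deterministic. Note that a single registry object like this one is exactly the kind of centralized component that becomes a SPOF unless it is itself replicated.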

One way to mitigate the problem is to run multiple instances of every component in the service. The dependency graph then becomes more flexible, allowing the system to switch to another instance instead of failing requests.
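The switch to another instance can be sketched as a simple failover loop. This is a minimal illustration under stated assumptions: instances are modeled as plain callables, a failed instance raises `ConnectionError`, and the names `call_with_failover`, `AllInstancesFailed`, `healthy`, and `crashed` are all hypothetical.

```python
from typing import Callable, List, Sequence


class AllInstancesFailed(Exception):
    """Raised when no redundant instance could serve the request."""


def call_with_failover(instances: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each instance in order and return the first successful reply."""
    errors: List[Exception] = []
    for instance in instances:
        try:
            return instance(request)
        except ConnectionError as exc:
            # This instance is unavailable; remember why and try the next one.
            errors.append(exc)
    raise AllInstancesFailed(f"all {len(instances)} instances failed: {errors}")


def healthy(request: str) -> str:
    # Stand-in for a working instance.
    return f"handled {request}"


def crashed(request: str) -> str:
    # Stand-in for an instance that is down.
    raise ConnectionError("instance is down")
```

Calling `call_with_failover([crashed, healthy], "GET /profile")` is served by the second instance. The request only fails when every instance in the list is down.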

Another approach is to keep backups that allow a quick switchover on failure. Backups are especially useful for components that deal with data, such as databases.
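A switchover to a backup might look like the sketch below. It assumes synchronous replication (every write is copied to the backup before it returns) and models an outage with an `alive` flag. The `Store` and `FailoverStore` classes are hypothetical and stand in for a real database and its failover logic.

```python
class Store:
    """Toy key-value store (hypothetical) that can be marked as down."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}
        self.alive = True

    def write(self, key: str, value: str) -> None:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def read(self, key: str) -> str:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class FailoverStore:
    """Writes go to both stores; reads switch to the backup if the primary fails."""

    def __init__(self, primary: Store, backup: Store) -> None:
        self.primary = primary
        self.backup = backup

    def write(self, key: str, value: str) -> None:
        self.primary.write(key, value)
        self.backup.write(key, value)  # synchronous copy keeps the backup current

    def read(self, key: str) -> str:
        try:
            return self.primary.read(key)
        except ConnectionError:
            # Switchover: the backup becomes the new primary.
            self.primary, self.backup = self.backup, self.primary
            return self.primary.read(key)
```

Because the backup already holds every acknowledged write, the switchover loses no data in this sketch. With asynchronous replication, the most recent writes could be missing after a failover.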

Allocating more resources, distributing the system and replication are some ways of mitigating the problem of SPOF. Hence designs include horizontal scaling capabilities and partitioning.
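Partitioning and replication can be combined so that each key lives on more than one node. The sketch below is one simple placement scheme, assumed here for illustration and not taken from the video: hash the key to pick a starting node, then take the next nodes in the list as replicas. The function name `replicas_for` and the `replication_factor` parameter are hypothetical.

```python
import hashlib
from typing import List


def replicas_for(key: str, nodes: List[str], replication_factor: int = 2) -> List[str]:
    """Return the nodes that should hold `key`: one hashed owner plus its successors."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and the number of nodes")
    # A stable hash (unlike Python's built-in hash()) gives the same placement on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]
```

With a replication factor of 2, losing any single node still leaves one copy of every key reachable. This modulo-based placement reshuffles many keys when the node list changes, which is why production systems often use consistent hashing instead.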

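It is important to note that the CAP theorem limits what redundancy can achieve: during a network partition, a system that requires strong consistency must give up some availability, so adding replicas alone cannot guarantee both.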


Comments

The Netflix example is a good one. I saw their PyCon 2018 talk and they showed how they do regional failovers in under 7 minutes. It was a good talk.

WittyGeek

Your positive energy makes me feel good! I feel like even I can get through an interview after watching you! Excellent!

fchas

Elon Musk has already seen the Earth as a single point of failure and has been trying to create a slave on Mars. In that case, maybe the Moon will be a load balancer.

semihkekul

A few more observations:

All your examples use reverse proxies to achieve HA (except the browser, though that is not explicitly mentioned), but there are other techniques:

1. Client-side load balancing: using a service registry (a somewhat smarter DNS) and smart clients to achieve HA. In your example, the browser can be considered a smart client, but there is a lot more of this in server-to-server communication to achieve HA in request-response flows.
2. Also, your LB in a single zone is usually kept in an HA configuration using something like keepalived and a floating IP.

songsenjoy

Prof. Gaurav Sen,

Learning a lot, sir. Thank you.

kumarakantirava

Very informative! Please keep making more videos!

manojmj

Hello Gaurav,
Thanks a million for sharing your knowledge and helping us.
Keeping your examples and explanations as simple as possible makes you stand out.
Please do add more small topics like this. They are definitely useful, and otherwise they go unnoticed inside a larger video or topic.

karandutt

Hi Gaurav, thanks for the videos. I am really enjoying them and learning a lot. I have a question. As you mentioned in the video, when a load balancer fails we overcome the problem by placing multiple load balancers and keeping all of their IPs in DNS. But how does DNS know whether the first load balancer is working or not? DNS is just a name-to-address resolver, and once resolution is done it is out of the picture. Where do we write the logic that says "if loadbalancer1 fails, contact loadbalancer2", or something like that?

venkatreddy

Sir,
It's wonderful of you to pin that question with your insightful answer. I was struggling to understand how clusters can offer high availability for websites. Your DNS answer enlightened me on a lot of design. Prostrating at your feet.

kumarakantirava

Informative video, especially the meteorite scene. Awesome!!

sumitlahiri

Man, you literally helped me finish my assignment! Learned a lot. Great content. Thanks!

sachinakinapally

Rephrase "more nodes" as "redundant nodes" when talking about addressing single points of failure.

ravindrababu

3:35 Gaurav - we don't have to worry about the Domain Name System (DNS) being an articulation point / SPOF (Single Point of Failure) since DNS is already a decentralized distributed system, correct? In a sense, we are already taking advantage of an existing scalable and resilient network architecture, correct?

harisridhar

Hey Gaurav, a big thanks for the effort you are putting in. I learnt many things that were myths to me until now. You have explained the concepts in very simple terms. Keep up the good work.

abhikeshu

Thank you Gaurav, we need to know these small details about each part.
Your videos are amazing. Keep making videos on large systems, and whenever you come up with a subtopic, you can link it in the description so that one can master that topic before moving ahead.
Thanks a lot.

tusharverma

Bro, all the very best in your new role at Uber, wishing you all success

influencer

Nice bro, I had to use this in recent development and I understood the concept... Thanks:)

suraj-gdqy

Chaos engineering is applied to an application or node before it goes to production: you trigger controlled attacks while keeping the ability to roll back the attack and return to the original stable state.

saitejajonnadula

p*p usually does not hold when you have a hot-data issue that simply migrates to the backup/replica and hoses that as well, because the system was not redesigned in time for the scale it now has to support. In that case, the particular technology's ability to handle load becomes a single point of failure.

GeorgeChi

Very well explained, with great positive energy and a great smile :) Thanks, buddy.

prashantsrivastava