Distributed Logging System Design | Distributed Logging in Microservices | Systems Design Interview

System Design | #SystemDesign :
Distributed logging system design is a very common use case these days, with more and more companies migrating to a microservice architecture.
Since a user request can pass through several microservices before it completes, it is really important to design logging so that a request can be traced end to end.
This is necessary when debugging the code or investigating a bug reported by a user.
In a system such as a payment system, you cannot ask a user to redo the transaction just so that you can trace the flow in real time, so distributed logging becomes very important.
In this video I cover distributed log tracing in microservices and show how I came up with the architecture for distributed logging.
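
One common way to make a request traceable end to end is to attach a shared trace ID to every log line and forward it to downstream services. The sketch below is a minimal illustration of that idea, not the design from the video: the header name `X-Trace-Id`, the variable `trace_id_var`, and the helpers `build_logger` and `handle_request` are all hypothetical names chosen for this example.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled (hypothetical name).
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = trace_id_var.get()
        return True


def build_logger(service_name, stream):
    """Create a logger whose lines carry the service name and trace ID."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(name)s trace=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())

    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


def handle_request(headers, logger):
    """Reuse the caller's trace ID if present, otherwise start a new trace."""
    trace_id = headers.get("X-Trace-Id") or uuid.uuid4().hex
    token = trace_id_var.set(trace_id)
    try:
        logger.info("request received")
        # Headers to forward on any downstream call (assumed header name).
        return {"X-Trace-Id": trace_id}
    finally:
        trace_id_var.reset(token)


if __name__ == "__main__":
    out = io.StringIO()
    orders = build_logger("orders", out)
    payments = build_logger("payments", out)
    forwarded = handle_request({}, orders)   # first hop creates the ID
    handle_request(forwarded, payments)      # second hop reuses it
    print(out.getvalue())
```

Both services log the same `trace=` value, so searching the central log store for that one ID returns the whole path of the request.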

In this video I cover various approaches to designing a distributed logger and conclude with the best approach.

This is also a very common system design interview question.

#systemDesign #TheTechGranth #DistributedLogging

Comments

I would recommend Cassandra as the database, because the system is write-heavy and Cassandra handles that really well. It also integrates well with the ELK stack, as well as with Hadoop, Spark, and Scala for analytics. Please correct me if I am wrong.

saurabhdubey

Hi,
First, thank you for sharing your knowledge and for putting in the effort to create content that explains it well.
I have a question about one of the design decisions.
1. It was mentioned that multiple services (Log Aggr Service, Error Agg, Alert Service) each poll the distributed queue individually. Do you think this is the preferred approach? I think it would be better to have a single service poll the queue and then pass each request on to the individual services mentioned above. To my understanding, a message polled from the queue is locked until it is deleted.
This is how the queue avoids duplicate processing of the same request.

Consider this scenario: the Log Aggr Service polls a message and finds that it is not of interest. It has to stop processing that message.
And I am not sure how we avoid the case where, the next time the Log Aggr Service polls, it picks up the same message again.

There are a few such scenarios that we have to take care of, as far as I understand.

Thank you again for your effort. Please consider the points above if you feel they add value. Thank you.

jatinkumar
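
On jatinkumar's question: whether a polled message is locked depends on the queue. If the distributed queue is a log-style broker such as Kafka (an assumption; other commenters mention Kafka, but this page does not confirm what the video uses), each consumer group keeps its own read offset, so every service sees every message and simply skips the ones it does not care about. The toy model below sketches that behavior in memory; `TopicLog` and the group names are hypothetical and this is not a real broker client.

```python
class TopicLog:
    """Toy append-only log with an independent read offset per consumer group."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # group name -> index of next unread message

    def publish(self, message):
        self._messages.append(message)

    def poll(self, group, max_messages=10):
        """Return the next unread messages for this group and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


if __name__ == "__main__":
    log = TopicLog()
    log.publish({"level": "INFO", "msg": "order created"})
    log.publish({"level": "ERROR", "msg": "payment failed"})

    # Each service polls under its own group name and filters locally.
    all_logs = log.poll("log-aggregator")
    errors = [m for m in log.poll("error-aggregator") if m["level"] == "ERROR"]
    print(len(all_logs), len(errors))  # prints "2 1"

    # A second poll by the same group does not return old messages again.
    print(log.poll("log-aggregator"))  # prints "[]"
```

With a lock-and-delete queue the concern in the comment does apply, and a single dispatcher or one queue per consumer is the usual workaround.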

I prefer using a locally buffered file with a log collection agent, so that the main service threads are not blocked. Sending a log doesn't need 100% reliability anyway.

kobew
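
The approach kobew describes can be sketched with Python's standard `logging.handlers.QueueHandler` and `QueueListener`: the service thread only puts the record on an in-memory queue, and a background thread writes it to a local file that a collection agent could tail. The helper name `build_buffered_logger`, the logger name, and the buffer size are assumptions for illustration; the agent itself is not shown.

```python
import logging
import logging.handlers
import os
import queue
import tempfile


def build_buffered_logger(path, max_buffered=10_000):
    """Return (logger, listener); a background thread writes records to `path`."""
    buffer = queue.Queue(maxsize=max_buffered)

    file_handler = logging.FileHandler(path, encoding="utf-8")
    file_handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    # The listener drains the queue on its own thread.
    listener = logging.handlers.QueueListener(buffer, file_handler)
    listener.start()

    logger = logging.getLogger("buffered-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    # QueueHandler enqueues without blocking; if the buffer is full the
    # record is dropped, which matches "doesn't need 100% reliability".
    logger.addHandler(logging.handlers.QueueHandler(buffer))
    return logger, listener


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "service.log")
    logger, listener = build_buffered_logger(path)

    logger.info("payment accepted")  # returns as soon as the record is queued

    listener.stop()  # drain the buffer before exiting
    for handler in listener.handlers:
        handler.close()

    with open(path, encoding="utf-8") as f:
        print(f.read())
```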

Could you please make a video on monitoring with a tool like Prometheus, including alerting (e.g. Alertmanager)? Thanks for sharing the video.

suchismitagoswami

I would like to know: when he said that we can replace the distributed queue with a central DB, which DB was he referring to?
Elasticsearch? MySQL? PostgreSQL? Redis?

crimsoncad

Here the queue becomes a single point of failure, right? How do we handle it?

anandt

Do we need an agent to send data from the nodes to Kafka, or should our application handle the logic of sending data to Kafka?

SumanthReddyAdudoodla

@ayush abhishek - so basically this is how New Relic, Grafana, Sentry, and Kibana logs work?

shaileshagarwal

It is time-series data, so it is better to use Cassandra with 2-layer sharding. There is no point using SQL.

obamabinladen
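
The comment above does not say what the two sharding layers are. One plausible reading, offered here purely as an assumption, is to shard first by service and then by time bucket, so that no single partition grows without bound. The sketch below computes such a two-part key; `partition_key` and the 24-hour default bucket are hypothetical choices, not something stated in the thread.

```python
from datetime import datetime, timezone


def partition_key(service, timestamp, bucket_hours=24):
    """Return (service, bucket_start) for a log event.

    Layer 1 is the service name and layer 2 is the start of the time bucket
    the event falls into. Both layers are assumptions for illustration.
    `timestamp` must be a timezone-aware datetime.
    """
    bucket_seconds = bucket_hours * 3600
    epoch = int(timestamp.timestamp())
    bucket_start = epoch - (epoch % bucket_seconds)
    label = datetime.fromtimestamp(bucket_start, tz=timezone.utc)
    return service, label.strftime("%Y-%m-%dT%H:00Z")


if __name__ == "__main__":
    event_time = datetime(2024, 1, 15, 13, 45, tzinfo=timezone.utc)
    print(partition_key("payments", event_time))
    # prints "('payments', '2024-01-15T00:00Z')"
    print(partition_key("payments", event_time, bucket_hours=1))
    # prints "('payments', '2024-01-15T13:00Z')"
```

In a Cassandra table this pair would typically serve as the composite partition key, with the event timestamp as a clustering column so that rows within a bucket stay ordered by time.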