Distributed Metrics/Logging Design Deep Dive with Google SWE! | Systems Design Interview Question 14

True sigma males don't care about your logs

00:00 Introduction
00:55 Functional Requirements
01:42 Capacity Estimates
02:41 Database Design
03:36 Architectural Overview
Comments

The best part is that you really focus on the 'why' aspects of things. Keep rocking!!

rahulaga

Thanks for the videos. I think one important functional requirement that most logging solutions offer (GCP logging ...) is text search, so potentially having a text search engine (ELS) is something to consider.

firoufirou

This video really helped me in one of my interviews, thanks a lot!!

advaitchabukswar

Good one, Jordan. You are very clear in your thoughts. Keep this going!! :) A metrics/logging system is challenging because both writes and reads are high scale. It looks like the write scale here depends on how far Kafka can scale. If we are looking at a very active public service receiving 100 billion msgs/day (roughly 1.16 million msgs/sec), I am guessing Kafka can handle that? What about read load? Since a lot of people may use the logs for customer investigations, there could be a lot of read load on the time series DB, since the other path is for batch insights. As I am typing this, I am thinking about Splunk. Could you make a video on how to design a Splunk-like system? (Maybe these are the building blocks.)

prashantbharadwaj
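
For the 100 billion msgs/day figure in the comment above, the conversion between daily volume and per-second rate can be sketched as follows. The helper names `per_second` and `per_day` are made up for illustration, and the calculation assumes traffic is spread evenly over the day, which real services rarely are.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_second(msgs_per_day: float) -> float:
    """Average per-second rate for a given daily message volume."""
    return msgs_per_day / SECONDS_PER_DAY


def per_day(msgs_per_second: float) -> float:
    """Daily message volume for a given average per-second rate."""
    return msgs_per_second * SECONDS_PER_DAY


print(f"{per_second(100e9):,.0f} msgs/sec")  # 1,157,407 msgs/sec
print(f"{per_day(1000):,.0f} msgs/day")      # 86,400,000 msgs/day
```

So 100 billion msgs/day averages out to a bit over a million msgs/sec, while 1000 msgs/sec corresponds to about 86 million msgs/day. Peak load would be higher than either average.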

High quality content. Keep doing these videos 👍

saisreenath

Thanks for the amazing video.
In one of my interviews I was asked to design a flight recorder that records data during a flight. Could you please make a video on that?

helperclass

awesome content! sorry for skewing your metrics towards the other 97% but that was funny as ...! 😅

raysdev

My suggestion to keep this channel going would be to get ripped and document your fitness journey. Just a thought :D

VyasaVaniGranth

Hey Jordan, what prevents us from sending the unstructured data directly from the client to S3? If we do not care about data enrichment, we might as well send it straight from the client, unless I'm missing something?

Also, a couple of follow-up questions just to clarify it for myself:
- Why do we need a logging service? Why can't we just push the data from the client straight to the queue?
- As far as I understand, we leverage the time series DB for queries on relevant "recent" data, so I assume we would need some sort of cleanup job that runs periodically? And we use a data warehouse (like Snowflake) to enable analytical queries that would be too big to run on our main DB?

dind
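
The periodic cleanup idea from the comment above could look roughly like this. It is only a sketch: an in-memory list stands in for the time series DB, and `Point`, `purge_expired`, and the retention window are hypothetical names and values, not anything from the video. Many time series databases also ship built-in retention settings, in which case a hand-written job may not be needed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Point:
    timestamp: float  # seconds since epoch (assumed unit)
    name: str         # metric name
    value: float


def purge_expired(points: List[Point], now: float, retention_seconds: float) -> List[Point]:
    """Return only the points that are still inside the retention window."""
    cutoff = now - retention_seconds
    return [p for p in points if p.timestamp >= cutoff]


points = [
    Point(0.0, "cpu", 0.5),
    Point(50.0, "cpu", 0.7),
    Point(100.0, "cpu", 0.9),
]

# Keep the last 60 seconds of data as of t=100; the point at t=0 is dropped.
recent = purge_expired(points, now=100.0, retention_seconds=60.0)
print([p.timestamp for p in recent])  # [50.0, 100.0]
```

Older points would already have been copied to the long-term path (S3 or the warehouse) before being purged from the "recent" store.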

I quite hate system design interviews and regurgitating proper nouns I’ve never engaged with, think I’ve chosen the wrong career

bryanbrianbrian

Thank you for the amazing content! Instead of S3, can we use Cassandra? What would be the trade-offs?

ShreyaGupta-nctd

Thank you for the amazing content! Can we use Cassandra instead of S3? What would be the trade-offs?

ShreyaGupta-nctd

Thank you. How did you manage to grasp systems design in such a short time? What is your approach to studying?

TheImplemented

Thanks for this!
Is a Flink consumer just like a normal Java/Spring queue consumer that is monitoring an AWS Kinesis stream? (I've never used Flink/Kafka.)
Do we have to use Flink in conjunction with Kafka queues, or would any service work?

rajrsa

Hey, how about using Apache Pinot or Druid to support better querying capabilities directly on the real-time data?

tavneet

Once we have the data in the time series DB, how do you suppose we go about hooking up a monitoring/alerting service to it? I'm not sure what the optimal route is between (1) a push-based model, where for every new metric (or batch) written to the time series DB we query an alarms/rules DB, and (2) a pull-based model, where the alarming service periodically queries the time series DB for every alarm/rule in the rules DB. Option 1 seems excessive since the majority of real-time metrics aren't going to fire an alarm. Option 2 seems excessive in that most alarms aren't firing at any given instant.

calvio
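
To make option 2 (the pull-based model) from the comment above concrete, here is a minimal sketch of one evaluation pass. Everything in it is an assumption for illustration: the `Rule` shape, the `evaluate_rules` function, and the in-memory `samples` dictionary that stands in for the time series DB. A real service would run this pass on a timer and query the actual database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    threshold: float       # fire when the windowed average exceeds this
    window_seconds: float  # how far back each evaluation looks


# (metric, start, end) -> average value in the window, or None if no data
QueryAvg = Callable[[str, float, float], Optional[float]]


def make_query(samples: Dict[str, List[Tuple[float, float]]]) -> QueryAvg:
    """Build a query function over in-memory (timestamp, value) samples."""
    def query_avg(metric: str, start: float, end: float) -> Optional[float]:
        values = [v for t, v in samples.get(metric, []) if start <= t <= end]
        return sum(values) / len(values) if values else None
    return query_avg


def evaluate_rules(rules: List[Rule], query_avg: QueryAvg, now: float) -> List[str]:
    """One pull-based pass: query each rule's window and collect firing rules."""
    firing = []
    for rule in rules:
        avg = query_avg(rule.metric, now - rule.window_seconds, now)
        if avg is not None and avg > rule.threshold:
            firing.append(rule.name)
    return firing


samples = {
    "error_rate": [(90.0, 0.2), (95.0, 0.4)],
    "latency_ms": [(95.0, 100.0)],
}
rules = [
    Rule("high_errors", "error_rate", threshold=0.25, window_seconds=60.0),
    Rule("slow_requests", "latency_ms", threshold=500.0, window_seconds=60.0),
]

print(evaluate_rules(rules, make_query(samples), now=100.0))  # ['high_errors']
```

The cost of this model scales with the number of rules times the evaluation frequency rather than with the write rate, which is the trade-off the comment is weighing.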

Can we use HDFS instead of S3? That way we would get data locality and the storage would be part of the Hadoop cluster. Would it be cheaper as well?

KathaPatel-om

How can your single-leader replication in the time series DB handle the enormous volume of writes? Won't it overwhelm that single leader?

Piyush-kyee

Hi Jordan - Thanks for this video. Do you mind sharing which Pinterest video you referred to in this design?

sumeet