How to NOT Fail a System Design Interview (By a Data Engineer)

This is what a systems design interview at Google, Amazon, Facebook or any other big tech company looks like. Systems design interviews are vague and tough to crack, but with the right approach, you can nail them! I'm sharing my experience and strategy for doing just that here!

Thanks for watching the video! What did you think about it? Drop your comments! Don't forget to like and subscribe :)

0:00 Pre Intro
1:41 Intro and Disclaimer!
3:49 Application design questions
6:25 Application design solution 1
9:22 Application design solution 2
12:00 Data pipeline design questions
14:21 Data pipeline design solution
17:29 Resources and preparation
19:13 What next?

#interview #google #technology #cloud #bigdata
Comments
Author

We have just reached 1000 subs! And it's all thanks to you super awesome subscribers and viewers! ❤️

If you have not subscribed already, what are you waiting for?

Drop your questions or suggestions below! Let's talk :)

JashRadia
Author

Awesome video, Jash. Generally nobody talks about pipeline design and system design for the DE domain. Great sharing of resources, and applause to you.

brownwolf
Author

Awesome video, really helpful.

I found these system design questions, which are often asked in interviews. If you could share some insights on them, it would be really helpful.

🌊 Data Pipeline Design: How would you design a data pipeline to handle large volumes of streaming data? (e.g., IoT devices or website clickstreams)

⚖ Batch vs. Stream Processing: Explain the differences between batch and stream processing and when to use each in data systems.

🏢 Data Warehousing: Design a data warehousing system for e-commerce. Discuss storage tech, data modeling, and querying methods.

🧩 Data Partitioning & Sharding: How to improve performance and scalability by partitioning and sharding a large database? Discuss trade-offs.

📦 Data Serialization Formats: Compare JSON, Avro, Parquet, and ORC. When to use each in data processing?

📜 Data Compression: Discuss data compression techniques in big data systems and choosing the right algorithm.

🌐 Distributed Data Processing: Explain distributed data processing with Hadoop, Spark, or Flink, emphasizing fault tolerance and data locality.

🔄 Data ETL: Design an ETL process to migrate data from a relational DB to a data lake. Discuss tools and frameworks.

⚙ Resource Configuration: Handling 100GBs of data per spark-submit - How to configure the cluster?

🧐 Data Quality & Validation: Ensuring data quality and validation in data pipelines, handling missing or erroneous data.

🔒 Data Security: Best practices for securing sensitive data in big data environments - encryption, access control, auditing.

📈 Scalability: Scaling data systems horizontally and vertically to meet growing data volumes and workloads.

📊 Monitoring & Logging: Importance of monitoring and logging in data systems, tools, and metrics for system health.

🗃 Data Archiving & Retention: Data archiving and retention strategy for a data warehouse, handling historical data.

💰 Cost Optimization: Strategies to optimize data storage and processing costs in cloud-based data architectures (AWS, Azure, GCP).

📜 Data Governance: Role of data governance in data engineering - ensuring compliance with data regulations.

ankit_in_munich
Author

As an SWE preparing for a DE interview, I found the mock interview and resources extremely helpful. Thank you!

yao
Author

Great video. This really explains how to solve system design questions. Also, thanks for sharing the resources 😃

jaladhithakur
Author

hey, really cleared my concepts. thanks bro, keep posting such videos.

salmansayyad
Author

That was excellent!
Well structured, and well explained. Even my undergraduate self was able to understand quite a bit of it :)

snaekboi
Author

In an interview I was asked to design a pipeline to get data from an API and store it in a DB for dashboarding. Each region has its own data, and all data should be collected centrally. My answer was to put the data in Kafka and from there use Airflow or Flink to process it and store it in a data warehouse for each region, with each region's incremental data then moved to a central data warehouse. The interviewer's next question was to write the code for an Airflow component. I said that right now I would need to use Google to write the code, as I don't like to memorize syntax and function names, but I know what logic should be used. That's how I got rejected in the last round. 😢

shubhambhandari
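
The "move each region's incremental data to a central warehouse" step in the comment above can be sketched in plain Python. This is only an illustration under stated assumptions, not the commenter's or the video's actual design: `RegionalStore`, `CentralStore`, and `sync_region` are hypothetical names, the stores are in-memory lists standing in for real warehouses, and a per-region sequence number is assumed as the watermark. In a real setup a scheduler such as Airflow would likely call something like `sync_region` on a schedule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RegionalStore:
    """Hypothetical stand-in for one region's data warehouse."""
    region: str
    rows: List[Tuple[int, dict]] = field(default_factory=list)  # (sequence_no, record)

    def append(self, record: dict) -> None:
        # Sequence numbers increase monotonically within a region.
        self.rows.append((len(self.rows) + 1, record))

    def read_after(self, watermark: int) -> List[Tuple[int, dict]]:
        # Return only rows newer than the last sequence number already copied.
        return [(seq, rec) for seq, rec in self.rows if seq > watermark]


@dataclass
class CentralStore:
    """Hypothetical stand-in for the central warehouse."""
    rows: List[dict] = field(default_factory=list)
    watermarks: Dict[str, int] = field(default_factory=dict)  # region -> last copied seq


def sync_region(regional: RegionalStore, central: CentralStore) -> int:
    """Copy rows the central store has not seen yet; return how many were copied."""
    last_seen = central.watermarks.get(regional.region, 0)
    new_rows = regional.read_after(last_seen)
    for seq, rec in new_rows:
        # Tag each row with its source region so regions stay distinguishable.
        central.rows.append({"region": regional.region, **rec})
        last_seen = seq
    # Persist the watermark so a rerun does not copy the same rows again.
    central.watermarks[regional.region] = last_seen
    return len(new_rows)
```

Because the watermark is stored alongside the central data, running the sync twice in a row copies nothing the second time, which is the property an interviewer would usually probe for in an incremental load.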
Author

this was so helpful and somehow reassuring, I really appreciate all your work and effort! Keep it up (:

larissaarreola
Author

Thanks, Jash, for creating the video. Really helpful.

priyankapandey
Author

Excellent video!
I have a question, though. What would have changed in the data pipeline if the source had been a streaming source? Where should we put Kafka/PubSub in the data pipeline?

sirajansari
Author

This is amazing. Please do more system design videos like this for data engineers, where you go in depth like this.

godfrey
Author

Can you share a detailed list of topics, from basic to advanced, for system design for data engineers?

abhinavpandey
Author

Great video on system design. Just wanted to know what tool/software you use for creating data flow diagrams during an interview.

puneetnaik
Author

Thanks Jash, Great content! I guess it is implicit in your design that EU data will only be in EU database and US data will only be in US database. Is that right? What about certain global application data, would it be a challenge to sync it in both regions?

nibu
Author

Is there any good book for learning system design for data engineering rather than for SDE?

ajrze
Author

Appreciate the effort; the content is very clear and presented in a way that is easy to absorb.
I would love to know if beginners can actually configure and test these systems, even if to a limited extent, without having to pay for any cloud service.
If not, what would be the next best thing one can do to learn these concepts practically?

hamidomar
Author

Is software engineering system design the same as data engineering system design?

pradhyumansinghmandloi
Author

Excellent Information.
Thank you for sharing..

artofheart
Author

Crisp and concise. Super helpful.

Is this an entry-level question? Does the L5 level have the same depth? And is this mostly for India-based or US-based companies?

Will a Data Engineer get more of the second example type of system design? Or is it the same as for an SDE?

What is a good site to practice DE specific system designs?

Thank you!

nobodyinparticula