System Design Interview: Design a Web Crawler w/ an Ex-Meta Staff Engineer

00:00 - Intro
01:58 - The Approach
04:08 - Requirements
10:31 - System Interface & Data Flow
14:48 - High Level Design
18:20 - Deep Dives
1:04:09 - Conclusion

A step-by-step breakdown of the popular FAANG+ system design interview question, Design a Web Crawler, which is asked at top companies like Meta, Google, Amazon, Microsoft, and more.

Evan, a former Meta Staff Engineer and current co-founder of Hello Interview, walks through the problem from the perspective of an interviewer who has asked it well over 50 times.

Good luck with your upcoming interviews!
Comments

Hey everyone! Thanks for being part of our growing system design community 🙌

✓ Deep dives into core technologies and concepts
✓ Guided Practice: Work through real interview problems with instant feedback
✓ Breakdowns of the most common system design questions
✓ Fresh insights from recent candidates' actual interviews including what questions they were asked.

We're constantly adding new content based on your feedback, so keep it coming!

hello_interview

I guess the best part about this man’s job is that if his startup fails he’s gonna have an easy time clearing that staff interview

halterskelter

I had an interview last Friday (June 14) and I followed your exact steps. The question was to design Ticketmaster. The Redis cache solution was the best. Thank you for these amazing videos.

_launch_it_

by far the best System design interview content I've come across - please continue making these. you are doing an invaluable service!

vigneshraghuraman

I don't often comment on videos, but I couldn't help commenting on yours just to say "What valuable content". Thanks a lot for all your videos! Keep doing this.

crackITTechieTalks

Bro, pls don't stop posting this kind of content. I've really loved all of your videos so far.
I can relate to the kind of small but impactful problems and solutions you mention in your videos, which indirectly impact the interviews.

rupeshjha

I took the Meta interview just last week and I was able to crack it. All thanks to you, brother.
The system design round went extremely well. I followed the exact same approach in all the questions and everything went really well.
Keep posting the videos, this is the best content on the internet for system design.

shoaibakhtar

Once again, the best system design interview overview I've ever come across. Please keep doing it for us!

AlexZ-lf

Please please keep posting more! It educates so many people and you make the world better!! :) Absolutely the best system design series!

jk

The way you explained why we need to break down the main service is awesome, bro. You basically covered multiple system design discussions that actually happen in a company in 1 hour, with a detailed "why" and "how". Just awesome.

paritoshpandey

Great content as always, thank you! Some comments about the design.

1. Concurrency within a crawler is going to bring a huge performance bonus.
2. Running an async framework for network I/O is much faster than using threading.
3. We can put the retry logic within the crawler to make things simpler.
4. DNS caching looked like overengineering because DNS is already cached at multiple layers: programming language, OS, ISP, etc.
5. We're processing the HTML in another service but we're hashing the HTML in the crawler, which seems wrong.

omerfarukozdemir
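
Points 1 and 2 in the comment above (concurrency inside a single crawler, async network I/O) can be sketched roughly as follows. This is a minimal illustration using only Python's standard `asyncio`, not the design from the video: `fake_fetch`, `crawl_batch`, and `max_concurrency` are hypothetical names, and `fake_fetch` simulates a network call instead of making one. How much async I/O gains over threads depends on the workload.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP GET (assumption: no network access here)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>{url}</html>"


async def crawl_batch(urls, fetch=fake_fetch, max_concurrency=10):
    """Fetch all URLs concurrently with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return url, await fetch(url)

    results = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(results)


if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(5)]
    pages = asyncio.run(crawl_batch(urls, max_concurrency=2))
    print(len(pages), "pages fetched")
```

A real crawler would swap `fake_fetch` for an actual HTTP client and would still need per-domain politeness limits on top of the global cap.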

By far the most inspiring, relevant and practical system design interview content. I found them really useful to perform strongly in my system design interviews

qwer

I've seen many videos related to system design, but your staff level knowledge shows when you are designing components! Amazing job 🥳

KiritiSai

Damn this is extremely nuanced. Some of the big-picture improvements (like adding the parsing queue) seemed kind of obvious, but then Evan would optimize it with a neat detail (e.g. including link in request so we don't have to fetch from database) that was so simple and yet hadn't occurred to me. Great series, great content, thanks so much!

TimothyZhou

I’m so glad to have found this channel. One of few system design resources that isn’t just performative but has actual substance!!

sinajafarzadeh

This is the first video of yours I watched and I loved it. Your pace is just right and you explain things well, so I didn't feel overwhelmed like I usually do when I watch systems design videos. Thank you!

TheKarateKidd

I'm watching your videos to prepare for my interview in 4 days. I hope I'll be able to handle it :DDD. So far these are the best SD videos I could find on YouTube.

alirezakhosravian

One of the first things that came to mind in the beginning of this problem is dynamic webpages. Most websites don't display the majority of their content on simple HTML. To be honest if I was interviewing a senior or above level candidate, not mentioning dynamic content early on would be seen as a red flag. I'm glad you included it at the end of your video, but I do think it is important enough to be mentioned early on.

TheKarateKidd

I usually refrain from commenting but this is by far the best explanation I can find for this problem statement.

I work at Amazon, and the use of message visibility timeout for exponential backoff is exactly what we do to add a delay of 1 hour for our retryable messages. One very minor practical insight: don't rely on the approximate message receive count metric, because it is almost always inaccurate. The count goes up whenever a thread reads the message, even if it doesn't process it. Instead, I set a retry count attribute when putting the message on the queue and checked whether it exceeded the retry threshold.

tushargoyal
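
The retry-count approach described in the comment above can be sketched as a small pure function. This is an illustrative sketch under stated assumptions, not the commenter's actual code: `backoff_delay_s`, `next_action`, the `retry_count` field, the 30-second base delay, and the 5-retry threshold are all hypothetical. The 12-hour cap assumes the queue is Amazon SQS, where that is the maximum visibility timeout. How the delay is applied (for example, by changing the message's visibility timeout) is left out.

```python
MAX_VISIBILITY_TIMEOUT_S = 12 * 60 * 60  # assumption: SQS visibility timeout cap


def backoff_delay_s(retry_count, base_s=30, cap_s=MAX_VISIBILITY_TIMEOUT_S):
    """Exponential backoff: base_s * 2^retry_count, capped at cap_s."""
    return min(cap_s, base_s * 2 ** retry_count)


def next_action(message, max_retries=5):
    """Decide what to do with a message whose processing just failed.

    Uses an explicit retry_count carried on the message (hypothetical field)
    rather than the queue's approximate receive count.
    """
    retry_count = int(message.get("retry_count", 0))
    if retry_count >= max_retries:
        # Threshold exceeded: stop retrying and park the message.
        return {"action": "dead_letter", "message": message}
    return {
        "action": "retry",
        "delay_s": backoff_delay_s(retry_count),
        "message": {**message, "retry_count": retry_count + 1},
    }


if __name__ == "__main__":
    failed = {"url": "https://example.com/page"}
    print(next_action(failed))
```

Because the counter travels with the message, it only increases when a worker deliberately records a failed attempt, not every time the message is merely received.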

This is such a great example for any kind of data application that needs asynchronous processing! Widely applicable!

davidoh