Google SWE teaches systems design | EP35: MongoDB

preview_player
Показать описание
Is M for Mini, or for Mongo?

Recommended Reading:
Рекомендации по теме
Комментарии
Автор

The viewers should really pay attention to the names Jordan is using in his videos

chengli
Автор

Hey Jordan,
Great Video.
Quick question, why do you think reads are slow in Cassandra when compared to Mongo ?
When LSM trees are combined with Bloom filters then reads are as good as Mongo, right?

JardaniJovonovich
Автор

HI Jordan, I am really stuck in the following question and need your help to make database choice. Please help 🙏

I want to make a database choice for a system design question as follows : Design a scraper service that takes as input an array of urls of web pages. For each url, save every image link in the html document. Any other links on the page, follow them and recursively get the images on those pages too. eg lets say url1 has image1 and follow url as url 2 and url 2 has images link "image2 and image3" and follow url : url3.

Which means while we are scrapping the url, we get 1 level data like

url {
url : "url1",
followUrl {
url2
}, image {
image1
}
}

and other result for url2 which has url3 and "image 2 and 3".

Now the task is to generate report :

Our Query access pattern is we enter a url let's say url1 and want to get report as :

url {
"url" : url1
image : image1
url : {
"url" : url2,
"image " : {
image2, image3
},
url {
"url" : "url3"
}
}

if we enter url 2, we get subdocument of it. The pages on web are static and once a job is done, results stored in db can be used by other users also when they want to get the report of those or sub-urls

I was thinking of using mongodb as I could store the parent URL as 1 attribute of document and "images and following urls" as 2nd attribute and third attribute.


My proposal is using mongoDB due to data locality. And not SQL due to over the network join.

utkarshgupta
Автор

Hi,
In system design what are the use cases which perfectly fits Mongo DB? If not Mongo DB what would be the ideal nosql DB for storing a social media post related information?

AkshayDhillon
Автор

Hey Jordan, pls can you explain how the lack of normalized data in Mongo requires having to change data in multiple places?

tce
Автор

Just thinking out loud: Mongo uses B-Tree and Single Leader Replication (SLR). I assume Postgres or MySQL/Oracle use SLR as well right? If Mongo has all these, why is it BASE as opposed to ACID? Is it because its a diff engine?

tce