Avoid premature Database Sharding

Someone asked a question on Twitter, and I thought it would be interesting to answer it here on the show.

I have a 2 million row table used in my CRUD Python app. I'm worried that as the table grows my inserts will slow down. Should I consider sharding my database or partitioning the table? Thank you.

* inserts are fast, queries are slow 0:00
* inserts can be slow 3:00
* indexes/stored procedures
* selects, updates and deletes can be slow 12:00
* add proper indexes.
* simplicity wins, premature optimization is bad 15:20
* crazy things people say, such as microservices on day 1, scare me
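
The "add proper indexes" point above can be checked before reaching for sharding. Below is a minimal sketch in Python, assuming SQLite via the standard `sqlite3` module, a hypothetical `users` table, and a hypothetical index name `idx_users_email`; none of these come from the video. It asks the database for its query plan before and after an index is created.

```python
import sqlite3

def build_table(conn: sqlite3.Connection, rows: int) -> None:
    """Create a hypothetical `users` table and bulk-insert `rows` rows."""
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users (email, age) VALUES (?, ?)",
        ((f"user{i}@example.com", i % 90) for i in range(rows)),
    )
    conn.commit()

def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan for `sql`; the last column holds the description."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in plan)

conn = sqlite3.connect(":memory:")
build_table(conn, 20_000)

lookup = "SELECT id, age FROM users WHERE email = ?"
args = ("user12345@example.com",)

# Without an index on email, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, args)

# The index adds some work to every insert but lets lookups search instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a scan of `users` and the second a search using `idx_users_email`; the exact wording varies by SQLite version. Other databases expose the same idea through their own `EXPLAIN` commands.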

Stay Awesome,
Hussein
Comments



Learn the fundamentals of database systems to understand and build performant backend apps

Check out my Udemy course, Introduction to Database Engineering


hnasr

summary:
"only solve problems that you have, don't solve the problem you don't have"

eslama-elwafa

Thanks for doing this video Hussein. As a new backend dev, I've always wondered about this and no one has answered it in detail as you have.

sathyajithps

If you enjoyed this video, consider checking out my Introduction to Database Engineering course. It's a bestseller on Udemy with over 14 hours of content, and students love it!

hnasr

🍑 Hussein 🍑 Hussein 🍑 Hussein 🍑 Hussein

hnasr

Hello Hussein, I am working with Couchbase, which can update indexes asynchronously. So even if you have indexes, the insert itself is still O(1); it takes time for the index tree to be rebuilt, but it is rebalanced eventually.

hazemabdelalim

You've got a good point, Hussein. Partitioning/sharding does not make sense for most use cases. Where it makes sense, as you said, is when you have billions of rows and terabytes of data (which only big enterprises have). I work for a big e-commerce company headquartered in Silicon Valley, and before migrating to the cloud we were using Oracle DB with sharding. We had several physical servers, and the UserId went through a hashing function that gave us the shard number. Based on an analysis of our query patterns, we decided we could move to DynamoDB (NoSQL), and we moved the data there into a single table. DynamoDB uses partitioning/sharding internally, but this is not visible to the end user, which means that complexity gets offloaded to AWS.

AleksandarT
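
The hash-based routing described in the comment above can be sketched as follows. This is only an illustration, not the commenter's actual setup: the function name `shard_for`, the choice of SHA-256, and the shard counts are all assumptions. It also shows one reason sharding adds complexity: with plain modulo routing, changing the number of shards remaps most keys.

```python
import hashlib

def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user id to a shard number using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used here to keep routing stable.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Hypothetical user ids, for illustration only.
user_ids = [f"user-{i}" for i in range(1000)]

# Routing is deterministic: the same id always lands on the same shard.
assert shard_for("user-42", 4) == shard_for("user-42", 4)

# Growing from 4 to 5 shards moves most users to a different shard,
# which means data has to be migrated between servers.
moved = sum(1 for u in user_ids if shard_for(u, 4) != shard_for(u, 5))
moved_fraction = moved / len(user_ids)
print(f"{moved_fraction:.0%} of users change shards")
```

Techniques such as consistent hashing reduce that remapping, and managed services hide it entirely, which is the trade-off the comment points to.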

Hey Nasser,
Which one should we prefer, an ORM or non-ORM (raw queries)? Can you shed some light on performance and complexity?
Thank you 🙏

cse

Hey Hussein. I have a question. If we have a clustered index on an email column, or any other non-auto-incremented column, in MySQL, then the write operation will need to find the correct place to insert the row. Is that correct? If so, then I think that's another reason to use auto-incremented primary keys for tables.

nishantkamboj

Great video, also, not related to the video but still wanted to ask it.

Do you have some sort of list with recommended books?

mudrRock

I think hash partitioning is a good idea if your table is likely to gain many more rows as the number of clients grows, for example a URL shortener table.

isaacfrost

Just purchased the course... I love your content, honestly... It is beautiful

LearnToCodeAcademy

YouTube, just give me the option to give more than one like!!! This guy deserves tons of likes :D

FISS

@Hussein at 00:58, what do you mean by a VANILLA INSERT?

saiavinashduddupudi

What about MongoDB sharding? Don't they provide sharding by default in MongoDB Atlas?

GaurangDhorda

Thanks for the video. The first statement is not totally right: inserts into tables with a lot of constraints and foreign keys, and with hundreds or millions of rows, can also be slow.

javisartdesign

I want to write my own database engine. Only update the index, for crying out loud, if and only if you do a select query. Or you can have a sub-index as a temporary thing, then merge it after a while.

colinmaharaj
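
The deferred-index idea in the comment above can be sketched as a toy in-memory structure. This is a minimal sketch under stated assumptions, not how any particular database engine works: the class name `LazyIndex` and its methods are hypothetical, writes only append to a pending buffer, and the sorted index is rebuilt the first time a lookup needs it.

```python
import bisect

class LazyIndex:
    """Toy index that defers maintenance from write time to read time."""

    def __init__(self) -> None:
        self._sorted_keys: list[int] = []  # main index, kept sorted
        self._pending: list[int] = []      # keys written since the last merge

    def insert(self, key: int) -> None:
        # Writes stay cheap: no index maintenance happens here.
        self._pending.append(key)

    def _merge(self) -> None:
        # Fold the pending buffer into the sorted index in one pass.
        if self._pending:
            self._sorted_keys = sorted(self._sorted_keys + self._pending)
            self._pending.clear()

    def contains(self, key: int) -> bool:
        # The first read after a batch of writes pays the merge cost.
        self._merge()
        i = bisect.bisect_left(self._sorted_keys, key)
        return i < len(self._sorted_keys) and self._sorted_keys[i] == key

index = LazyIndex()
for key in (5, 1, 3):
    index.insert(key)

print(index.contains(3))  # triggers the merge, then binary-searches
print(index.contains(4))
```

The trade-off is that the cost does not disappear; it moves to the first query after a burst of writes, which is why real engines usually merge in the background instead.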

16:03 YAGNI principle and KISS... implement the advanced stuff only when you know you are going to need it....

Flankymanga

Hussein bro, please make a video on everything about live video streaming apps: how we can implement one, what the best approach would be, and which database would be best, for both live video streaming and normal video streaming, like Twitch and Netflix. Please explain the complete backend architecture with an example. Thank you. I am pretty sure it would help a lot of people. There aren't many insightful videos about this topic on YouTube.

whotookmyign