Implementing Vertical Sharding

Показать описание

Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.

In the video, I explained the importance of sharding in scaling databases, focusing on vertical sharding where tables are distributed across multiple servers. I discussed the transition from monolithic to microservices architecture and how vertical sharding helps in this shift. I detailed the implementation steps of moving tables between database servers, emphasizing the use of tools like Zookeeper for storing meta information and ensuring reactive updates across API servers. The process involved dumping tables, loading them into new databases, setting up replications, and performing a seamless cutover for data consistency.

# Recommended videos and playlists

If you liked this video, you will find the following videos and playlists helpful

# Things you will find amusing

# Other socials

I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.

Thank you for watching and supporting! it means a ton.

I am on a mission to bring out the best engineering stories from around the world and make you all fall in
love with engineering. If you resonate with this then follow along, I always keep it no-fluff.

Рекомендации по теме

Комментарии

I have hit my head so many times understanding the sharding, but could not get it so well. But this guy never fail to explain the things in so so easy way. I am lucky that I found this channel on YT.

ddevotion

binge watching after a break ! So much quality content over the last week. Thanks a lot

adianimesh

Maybe a dumb question but here I go

If the table renaming step ( table to table.bak) is done after firing zookeeper update, can’t this be more helpful in reducing the small database down time ( assuming zookeeper updates happen immediately without consistency issues )

Saying this because, if we do this, the second DB server is anyways up and will take requests and renaming can happen later … this will also ensures that the replication is completely done

vasusharma

There are actually 2 ways to do Vertical Sharding:
1) The one which you mentioned where there is no downtime, it's real time
2) The one where there is downtime and T2 is dumped on DB2 during downtime so as to avoid replication and further steps

You mentioned in video at last that large tables are not sharded as the replication process for them to get in sync is difficult, for such tables we can use downtime approach. What if you come up to a situation where DB1 shard has 2/3 almost same size tables and all are hot? You can take a downtime and do vertical Sharding.

Also for financial domain companies, this realtime vertical sharding with 10-15ms is not acceptable, they also might be using downtime approach!

jaisamtani

how is vertical sharding different from partitioning?

cnp

Really grateful to God that I found this valuable gem like content on time♥️
Never worry about views, because some gem music videos are also hidden and craps are getting 1B views

homestaysandcafes

Arpit bhai you just rocked it, ek dum top level content with no clutter even for 1 second.
This video filled me up with confidence that I can handle database in production.
Started binge watching your channel.
Keep making content Sir.
One thought that hit me while watching this video : This type of content will catalyse the transition of India from being IT services hub to IT manufacturing hub😄

mukeshmahadev

Thanks for an amazing video - may i know the difference between federation and vertical sharding?

ramyakrishnan

Designing cannot be anymore intersting and easy when compared to your videos. Just keep those videos comming everyday !! I've already took out a slot from my day to watch your videos.

Aditya-usgj

Great content Arpit. Very easy to understand,
small doubt - Why to rename the table first then go for zookeeper config change? Why not first update the config in zookeeper then take back up the table. This will reduce the availability loss.

ashishtewari

One doubt. While copying bin log to shard 2 we record last time stamp or id till which we copied right? After that copy is done, we start replication right?

CDC or replication service catches up the shard 2 with live updates

What about the updates that are happening to db while we are copying the bin log?

arunrahullakkapragada

One of the best content on YT for Vertical Sharding, looking forward for Horizontal Sharding video. :). One question related to joining between 2 database tables, do we need to use host level join in case?

vikas

Arpit, as always amazing video, thanks for sharing. I was actually planning to join your Feb cohort but couldn't enroll before the registrations got closed.

I have some thoughts on this video, maybe I am missing something. I feel migration of table t1 from 1 db server to another with this approach is kind of over-engineering. I have done migration in my previous company, let me explain my approach.

1. We don't need to store the metadata about which db server the table belongs to in zookeeper or any service discovery. Generally in each app server we have our DB configurations file (yaml, xml), we can add and maintain both the DB configs in that. And app server connects to both.
2. The cutover can happen gradually with dual writes to the table in both the DB servers (simple code change). And historic data can be migrated by the db table snapshot.
3. The final cutover can be done by maintaining a config in a remote config, which is basically WIREON/WIREOFF (WOWO) configuration, i.e. turnoff the writes to the previous db server table (example: disable.writes.to.xyz := true)

Let me know your thoughts..

aniruddhkhera

Appreciate the detailed breakdown! 🧐 I wanted to ask something unrelated: 🤔 I have these words 🤨. (behave today finger ski upon boy assault summer exhaust beauty stereo over). How do I use this? 🤨

GieGarczynski

Thanks Arpit, Allah bless you. Top notch level content.
Have one query :
If I have a large DB/Table for which I have indexing on some columns as well. While partitioning my data, do my indexing also got partition or I have to do manual indexing on my data partition on it's restored on different DB instance.

Polly

Great content. Didn't feel like skipping even for a sec. Kudos!! Also, one question - How do we go about migrating huge table from one database server to another? By huge table I am assuming that the table size is big and also there are huge number of writes happening.

ujjwalsaini

Just connecting the dots, is this the same way how we scale blob storage (S3/Azure) when data node in a bucket gets hot? :)

shivamsrivastava

What about the failed requests when we renamed the table ie `Table Not Found` part. Read requests are acceptable but those write requests will be lost, wont be that a consistency issue?

chiragrajani

Hey Arpit, Great Video . Have a doubt here
You mentioned zookeeper watch will inform the API server when there is a change, but where will the API Server store this config information ?
If API server is not storing the config information then with every request we need to hit the zookeeper first to get the config, which will surely add latency.

sayantankundu

How the watch updates config of API server? Are they connected on websocket? Dont think so generally. Please explain.

rahulsarkar

Implementing Vertical Sharding

Implementing Vertical Sharding

Sharding | Horizontal Scaling | System Design Interview Basics

What is Database Sharding?

Mongo DB Sharding Basics in 5 Minutes

Database Sharding in 200 Seconds

Learn Vitess - Vertical Sharding (part 4)

What is sharding of Database?

Before Sharding Your Database Consider This #shorts

What is Database Sharding?

Data Sharding (Vertical and Horizontal)

What is DATABASE SHARDING?

Database Sharding - Postgres example

The Basics of Database Sharding and Partitioning in System Design

MongoDB Replication & Sharding: Scaling Your Database for Performance

InfluxDB IOx: Impacts of Sharding, Partitioning, Encoding & Sorting on Distributed Query Perform...

Manual sharding vs Automatic sharding

Databases: Scaling, Sharding and Querying - Jinal Parikh - Confinement Session

Key Based Sharding

Practical steps for Vertical Database Sharding

Google SWE teaches systems design | EP5: Database sharding/partitioning

Important sharding techniques.

How to Use the new JDBC 4.3 Sharding API for Massive OLTP Scaling

Database Sharding and Partitioning

Database Sharding Explained in 1 Min | What is Database Sharding | Data Sharding Techniques #shorts