How Instagram efficiently serves HashTags ordered by count

preview_player
Показать описание

Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.

In the video, I delved into how Instagram efficiently searches for hashtags using partial indexes in PostgreSQL. Instagram relies heavily on PostgreSQL for user-created data. By utilizing partial indexes, Instagram optimized hashtag searches by indexing only popular hashtags with over 100 photos, reducing the index size significantly. This optimization technique improved query performance, showcasing the power of understanding database features like partial indexes. Applying simple first principles led to a substantial enhancement in database performance, demonstrating the importance of database knowledge in optimizing production systems.

# Recommended videos and playlists

If you liked this video, you will find the following videos and playlists helpful

# Things you will find amusing

# Other socials

I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.

Thank you for watching and supporting! it means a ton.

I am on a mission to bring out the best engineering stories from around the world and make you all fall in
love with engineering. If you resonate with this then follow along, I always keep it no-fluff.
Рекомендации по теме
Комментарии
Автор

Man, you are next level. Your chanel is a binge watch for engineers.

rahulsarkar
Автор

5:31 "this is the beauty of Instagram, they never optimise" - truth has been spoken 😅

xskrish
Автор

Finished ✅ ... ( Self Note ) :

New Learning:
1) It's great to see the query execution plan .. an insight can help a lot in optimization ( till now I just knew the query execution plan in theory ... How to access it was new to me )

2) partial Indexing can effectively be used in tail graph ...
The idea of partial Indexing is similar to Heavy Light Decomposition ( CP Background) ( it is used in graph ) so yeah .. cool to see a idea getting implemented in lite version and gaining Huge performance.

Now I think ... I can try to use Heavy Light Decomposition in Vector databases 🤔 (a must try 😁 thing )

dipankarkumarsingh
Автор

Hi Arpit, your approach to engineering is truly inspiring. Despite the intimidating jargon like Partial Indexes, distributed transactions, etc, your emphasis on first principles makes Computer Science feel like Common Sense. Keep up the great work and continue to inspire us. Thanks!

okcodingbeast
Автор

I’m really feeling lucky to have found your channel. Keep up the good work !

j-stormz
Автор

as always concise n effective.. love it

tawseefbhat
Автор

I AM LATE TO FIND THIS CHANNEL, BUT
THIS IS ASLI ENGINEERING CHANNEL.
THANKS ARPIT SIR FOR PROVIDING SUCH CONTENT.

rajatiitr
Автор

That's a perfect usecase for Partial Indexing! Great Video.

shantanutripathi
Автор

Got some really nice insights from the topic explained Arpit sir you are really amazing in explaining and breaking down hard things into simpler chunks

shishirchaurasiya
Автор

GEM. Thanks for such awesome and mindblowing content.

utkarshshrivastav
Автор

Thank you for providing good content ❤️👍

yadneshkhode
Автор

Hey, this course is not for beginner-friendly. The first video provided an intro but from the next video onwards we went straight into partial indexing which most people aren't aware of. If possible, could you also insert some basic videos, just to get the audience up to speed. Thanks for the amazing explanations though. You're extremely knowledgable and a great presenter.

teetanrobotics
Автор

What if we can have another boolean column which tell us whether hashtag popular or not. Default value of this column is false but Whenever hashtag count value reached to certain threshold(in this case 100) then it will update to true. What is pros and cons of this method over the partial indexing?

saifulhasan
Автор

very good explanation but as a beginner I dont know about what are indexing, partial indexing in database ... will figure it out through Google, thanks.

oshogarg
Автор

Do we have something similar to partial indexing in SQL server?

lakshjain
Автор

Bhaiya What if I want to seach a tag with media < 100 ? (Like what if its not a popular tag and is a part of long tail)

ankurbansal
Автор

Hi, instead of creating a new partial index, what if they would have fired the sql query with count >= 100 filter directly? That would also sort around 169 rows only, right? Won't that be similar in performance as with partial indexes.

AmanGupta-fbfz
Автор

hii arpit bhaiya can you please make the video on collaborative software like git automerge how they works?

atishayjain
Автор

Hi Arpit bhaiya saw this video and even read the medium article, i just had one question, the article content is 10 years old, do you still think they will be using the same optimization techniques on the hashtags services or it could have changed by now.

gaurangmittal
Автор

Why did MySQL did not implement Partial Indexing? Also, can you compare between MySQL and PostgreSQL? What is the better database of those two

adityasanthosh