The effect of Random UUID on database performance

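In this video I whiteboard how random UUIDs hurt write (and read) performance when used in primary and secondary indexes. UUIDv4 is the most popular version, but its values are almost entirely random, so consecutive inserts land all over the index. Compare that to Snowflake IDs, ULID, UUIDv7, or even UUIDv1, which all embed a timestamp.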

0:00 UUIDv4
2:30 B+Tree Indexes and UUID
5:30 Random UUIDv4 Insert Workload
12:40 Ordered Insert Workload (UUIDv7/ULID, Sequence)
14:00 Shared buffer pool flushes
15:00 Shopify ULID use case
17:00 URL shortener UUIDs?
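
The difference between the random and ordered insert workloads discussed in the video can be sketched with a toy model. This is a minimal illustration, not a real B+Tree: the leaf level is modelled as a sorted list cut into fixed-size pages, page splits are ignored, and `PAGE_SIZE` and `leaf_pages_touched` are hypothetical names chosen for this example.

```python
import bisect
import uuid

PAGE_SIZE = 100  # hypothetical number of keys per leaf page


def leaf_pages_touched(existing_keys, new_keys, page_size=PAGE_SIZE):
    """Count the distinct leaf pages a batch of inserts would land in.

    Simplification: the leaf level is a sorted list cut into fixed-size
    pages, and the list is not modified (no page splits).
    """
    sorted_keys = sorted(existing_keys)
    touched = set()
    for key in new_keys:
        position = bisect.bisect_left(sorted_keys, key)
        # Keys larger than everything else go to the last (rightmost) page.
        touched.add(min(position, len(sorted_keys) - 1) // page_size)
    return len(touched)


# Random workload: existing and new keys are both UUIDv4 values.
existing_random = [uuid.uuid4().int for _ in range(100_000)]
new_random = [uuid.uuid4().int for _ in range(1_000)]

# Ordered workload: a plain increasing sequence stands in for UUIDv7/ULID.
existing_ordered = list(range(100_000))
new_ordered = list(range(100_000, 101_000))

random_pages = leaf_pages_touched(existing_random, new_random)
ordered_pages = leaf_pages_touched(existing_ordered, new_ordered)
print("pages touched, random keys: ", random_pages)
print("pages touched, ordered keys:", ordered_pages)
```

In this model the ordered batch lands in a single page (the rightmost one), while the random batch is spread over hundreds of pages, each of which would have to be in the buffer pool and flushed later.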

Stay Awesome,
Hussein

Comments

The whole time I was wondering why not just insert on a new index rather than pointing to a UUID, that way it's always ordered. But then you said MySQL defaults to this at the end of the video. That's critical for the 'why' of this video.

AffyisAffy

Actually, I was thinking about that question a couple of days ago
Thank you :)

ahmedalaaeldin

Yes, when you age them out, they might leave gaps in the index tree, but those gaps also get reused regularly. They might not be “nowhere near each other” but always in between. So with larger pages the splits are less likely. (Of course IOT is to a good usecase)

berndeckenfels

I've used UUIDs generated outside the database and stored as a varchar primary key within the DB. It worked a lot better for the 100+ million devices I was working with. The primary goal was to use them for hash tables when using other systems like Redis/Dynamo etc. For lookups, with a bit of magic in the UUID generation (which is not done by the DB), you can apply Bloom filters as well. I've been out of the engineering field for a decade+, but these sorts of videos are always much needed to discuss the fundamentals of how things work.

RealEvangelizer

This one cracked me up 09:59 😂

Great work as usual 👏
Please consider making a video on Pinecone and vector databases 🙏

gaml

Awesome! Thanks. Dialectic of randomness and order. Very beautiful.

vasiliynet

Never thought Hussein would come up with a pun in the video: "that's what she said"

tempaccount

The silent joke at 10:00 That's what she said. Haha, classic.

AbhishekSingh-pudg

MySQL 8 uses UUID v1 (kind of compatible with UUID v2).

With the optional second argument swap_flag set to 1, UUID_TO_BIN() swaps the time-low and time-high parts of the UUID before converting it to a BINARY(16) value, and BIN_TO_UUID() with the same flag reverses the swap. This makes the stored values roughly sequential, since UUID v1/v2 is based on a timestamp.

ddanielsandberg
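
The byte rearrangement described in the comment above can be sketched in Python. This is an illustration of the idea, not MySQL's implementation: `to_bin_swapped` and `v1_from_timestamp` are hypothetical helpers, the node and clock-sequence values are made up, and the assumed layout is time-high, then time-mid, then time-low, followed by the remaining eight bytes unchanged.

```python
import uuid


def to_bin_swapped(u: uuid.UUID) -> bytes:
    """Reorder the 16 bytes as time_hi | time_mid | time_low | rest."""
    b = u.bytes  # time_low (4) | time_mid (2) | time_hi_and_version (2) | rest (8)
    return b[6:8] + b[4:6] + b[0:4] + b[8:16]


def v1_from_timestamp(ts: int) -> uuid.UUID:
    """Build a version-1-shaped UUID from a 60-bit timestamp (illustrative only)."""
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    # Clock sequence and node are arbitrary placeholder values.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             0x80, 0x00, 0x123456789ABC))


# Four consecutive timestamps that cross a time_low wrap-around.
base = (0x1EE << 48) | (0x1234 << 32) | 0xFFFFFFFE
uuids = [v1_from_timestamp(base + i) for i in range(4)]

raw = [u.bytes for u in uuids]
swapped = [to_bin_swapped(u) for u in uuids]

print("raw bytes sort in time order:    ", sorted(raw) == raw)
print("swapped bytes sort in time order:", sorted(swapped) == swapped)
```

The raw layout starts with the fastest-changing field (time-low), so byte order and time order diverge; after the swap the slowest-changing field comes first and the two orders agree.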

10:00 would have never expected that joke from you 😂

RZhuAmpere

I usually use an index ordered by timestamp, so the UUID (used here only as a unique identifier) is not a problem.

Maman-Setrum

lowkey "That's what she said" reference at 9:57 😂

anuragbhagsain

Please do a video on vector databases.

mikestaub

The worst is using UUIDs with SQL Server: since the PK is clustered, performance becomes awful very quickly.
I use PostgreSQL and have never had performance issues with UUIDs.

OzoneGrif

What about distributed NoSQL DBs (document, key-value, etc.), where the recommendation is to use partition keys with high cardinality and avoid sequential values because they create “hot” partitions, for inserts at least? Maybe it's a different use case, but it would be an interesting topic too.

alegon

Does this concept also apply to indexed VARCHAR columns? Strings such as usernames, emails, and URL slugs are also effectively random. If so, what is the workaround for indexing string values?

usamaabubakar

I just randomly stumbled upon your video, and I've always had this itch about UUID performance. Your video really boosted my confidence in what I was thinking. Thanks a lot!

Oh, and quick question: Do databases like MariaDB, MySQL, or PostgreSQL automatically play nice with ULIDs? And how do they know to sort 'em out for indexing?

therealtuyen
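
On the ULID question above: as far as I know, none of these databases has a built-in ULID type in core, so a ULID is indexed like any other string or 16-byte value. It sorts by time because the 48-bit millisecond timestamp occupies the most significant bits and the Crockford Base32 alphabet is in ascending ASCII order. A minimal sketch of that encoding follows; `make_ulid` is a hypothetical helper rather than any library's API, and a byte-wise (binary) comparison is assumed.

```python
import os
from typing import Optional

# Crockford Base32 alphabet (no I, L, O, U); characters are in ascending ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def make_ulid(timestamp_ms: int, randomness: Optional[bytes] = None) -> str:
    """Encode a 48-bit timestamp plus 80 random bits as a 26-character string."""
    if randomness is None:
        randomness = os.urandom(10)  # 80 bits
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 characters of 5 bits each, least significant first
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


# IDs created at increasing millisecond timestamps.
ids = [make_ulid(t) for t in (1_700_000_000_000,
                              1_700_000_000_001,
                              1_700_000_000_002)]

# A plain string sort, which is what a text index does, restores creation order.
print(sorted(reversed(ids)) == ids)
```

Two IDs created within the same millisecond are ordered only by their random part, so the ordering is by time at millisecond granularity, not strictly by insertion.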

Pro tip: watch at 1.5x speed. You're welcome

redpillsatori

His videos are so interesting but so slow. I can easily watch them at 2x speed.

shmmh

If your IDs are public and you make them sequential, be aware that third parties will know the number of entities created over a time period.

saggitt
