A Deep Dive in How Slow SELECT * is

preview_player
Показать описание
Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)

In a row-store database engine, rows are stored in units called pages. Each page has a fixed header and contains multiple rows, with each row having a record header followed by its respective columns. When the database fetches a page and places it in the shared buffer pool, we gain access to all rows and columns within that page. So, the question arises: if we have all the columns readily available in memory, why would SELECT * be slow and costly? Is it really as slow as people claim it to be? And if so why is it so? In this post, we will explore these questions and more.

0:00 Intro
1:49 Database Page Layout
5:00 How SELECT Works
10:49 No Index-Only Scans
18:00 Deserialization Cost
21:00 Not All Columns are Inline
28:00 Network Cost
36:00 Client Deserialization

Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon)

Fundamentals of Networking for Effective Backends udemy course (link redirects to udemy with coupon)

Follow me on Medium

Introduction to NGINX (link redirects to udemy with coupon)

Python on the Backend (link redirects to udemy with coupon)

Become a Member on YouTube

Buy me a coffee if you liked this

Arabic Software Engineering Channel

🔥 Members Only Content


🏭 Backend Engineering Videos in Order

💾 Database Engineering Videos

🎙️Listen to the Backend Engineering Podcast

Gears and tools used on the Channel (affiliates)

🖼️ Slides and Thumbnail Design
Canva


Stay Awesome,
Hussein
Рекомендации по теме
Комментарии
Автор

Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)

hnasr
Автор

What would be nice is to have some numbers to see the real impact on performance and to know which of this factors are more crucial

Rettou
Автор

Bigger problem than it being slow is that it could cause issues with your code when you rollback failed deployments after db migrations because db migrations could add new columns that your older version of the code does not recognise and fail to scan.

dyto
Автор

35 seconds in. Noticed the background is less noisy - love that it's just three books. And the lighting is softer. This is nice. I thought the sword was gone till I googled the word Musashi and realized the spirit of the Samurai is still very much here. This is nice too.
Back to SELECT * . Hopefully this is the video where it finally sinks in for me what your *fetish* for "SELECT *" is. I'm a big fan, Hussein, I can say you played a big part in my having the career and approach to learning that I have now. But as anybody who's watched numerous videos you've made on databases would notice, you have a thing for SELECT * :)

shiewhun
Автор

Kids these day will not call you cringe. They will call you 'an amazing person who teaches us so much and gets us interested in learning the fundamentals'.

HarshKapadia
Автор

But if you actually need everything, SELECT * is not slower than selecting every column explicitly, right?
So SELECT * isn't slow. Selecting in general is slow.

dasten
Автор

Thanks you for providing such high quality vids 🙏

bgvphnw
Автор

thank you for sharing these valuable info

bashardlaleh
Автор

Hi Hussein, what you think about ORM? is it worth to used? the pros and cons? any video about it? I would love to watch it thanks :))

eulerpi
Автор

Thank you very much sir. This is a very detailed explanation.

pnworks
Автор

Hussein you are such an inspiration ❤️

codinggavin
Автор

Great video sir, you teach the good stuff

dariusdoku
Автор

Would love to have the video reg File block and byte. How Read operation is done and its underlying logic

ayushpandey
Автор

Hey Hussein, I have one question-
How is select * different from select col1, col2, col3, col4 where id =1? - say I have 100 columns in my DB or some very high number of columns
At the end it has to still figure out the page where col1, col2, col3, col4 are, right ?
The only part that would be less is- deserialisation cost, n/w cost . But the main part of searching the remaining the columns from heap still lies there even if we are selecting few columns as opposed to *.
Is my understanding correct ?
can you shed some light ?

okfxklf
Автор

I'd love to if you can review Citus postgres distribution

RahulAhire
Автор

"A page is.. and a block.. is the most overloaded term in SE." 😅

_dnL
Автор

Hi Hussein, what would be the impact of using an guid as primary key column vs using an autoincrememt as a primary key column?

hemant_pande
Автор

Question: if I do “select *” and in the where clause I put a indexed column - what’s even the benefit of using the index if eventually the db will have to do a table scan?

mtnrabi
Автор

I listen to your shows when going for a walk and damn so much info i gain during that 40 mins.

apoorvgupta
Автор

Hi Hussein, Even the queries selecting few columns like select a, b, c from t1 where grade>90 still needs to fetch the pages from Disk Randomly
how is it greatly different from select *

sriteja