How Fast is Python's Sort? Performance Testing

preview_player
Показать описание
Performance testing Python's builtin sort.

Learning to performance test is an important part of transitioning from a beginner Python level to an intermediate to advanced Python level. Measure measure measure! Get coding!

SUPPORT ME ⭐
---------------------------------------------------

BE ACTIVE IN MY COMMUNITY 😄
---------------------------------------------------
Рекомендации по теме
Комментарии
Автор

Came for TimSort performance, stayed for the general performance testing knowledge

oddlang
Автор

Pythons build in sort is btw called TimSort. Its a merge sort and insertion sort hybrid

janknoblich
Автор

I found your results very insightful and inspiring.
Didn't think about performance testing that way.
Your video is unique.
I would be interested in seeing more performance testing stuff

JohnZakaria
Автор

Re: other things running at the same time, you can at least mitigate scheduling differences by using `process_time_ns()` instead of `perf_counter_ns()`. That'll let you compare the user+system time (i.e. the amount of time actually spent executing the particular process and its kernel calls, with pauses in which other processes were scheduled not included) instead of comparing wall time (i.e. the length of the total real-world interval between when the 'start timer' bit got scheduled and when the 'stop timer' bit got scheduled). That doesn't eliminate _all_ variation (e.g. you still waste process time after a context switch on branch mispredictions and cache misses and page faults and stuff), but it'd at least let you ignore the _bulk_ of the CPU time allocated to other processes.

One other trick: call `os.sched_yield()` right before you capture the start time. This will give the kernel the opportunity to get other pending processes out of the way, that it might otherwise decide to schedule during the first millisecond of your test run, and ensure that your test case starts right at the beginning of a jiffy (and therefore have an entire jiffy to run before the next scheduler interrupt, meaning that test cases shorter than 1ms likely won't get interrupted at all.)

AJMansfield
Автор

I'm a python beginner and your videos help me out a great deal, they're really fascinating and cool

yazanalj
Автор

Didn't know performance testing went this deep . . . another rabbit hole to explore!

stilgarfifrawi
Автор

I am more interested in knowing why exactly the second sort is slower than the first one and is there a way around it?

PrashantKg
Автор

Cool video. What about using `__slots__` on the classes to be sorted? Should at least beat out the regular class because of the optimization in class attribute lookup, but I wonder how close it comes to the tuple sorting.

psy
Автор

This channel makes me feel like a genius

player
Автор

thanks for a very interesting analysis

TheYellowBlueWhite
Автор

"nearby electromagnetic fields" got me.
Ah yes, thinking about electromagnetic fields when sorting, who doesn't?!

connorclub
Автор

Hi, thanks for the videos. What plotting library are you using? I like that it looks interactive

javiercmh
Автор

What I found surprising is that in the end result, sort appears to perform better on the GeometricString cases than on the UniformInt cases. After all, string comparison is usually/intuitively slower than integer comparison.

Then I figures since you're restricting the characters of the strings to the printable characters, it is similar to n-ary FewInts cases but the GeometricStrings cases outperform those too!

Perhaps my first assumption doesn't hold... Any insights on the matter?

Marquesian
Автор

Another caveats, this is micro benchmark that show max possible speed of given code but this do not match exactly same code when its run in bigger app.
Other pats of code could interfere with speed and you will not get same results as this in real environment. One of many reason is CPU cache.

Conclusion is that you start with benchmarks like this but then when you finish work you need benchmark full application too to see if your code is better than old one.

von_nobody
Автор

Is there a reason the nearly sorted ints and fully sorted random strings have those dips in elapsed time at what appears to be powers of 2? (Most notable points on both graphs are at ~256 and ~512 elements)

kiefac
Автор

Could you make a video on any substring searching algorithm performance testing? Great content 👌

modernfreeman
Автор

What's with the dips in time for the nearly sorted ints at around 256 (?) and 512 (?).

auntwilson
Автор

9:27 I didn't know that you can do that with Matplotlib!

rosgori
Автор

How much faster is python builtin than a custom quicksort? Than a linux sort?

tedarcher
Автор

Sorry, kind of irrelevant but gotta ask. It's a great theme you have. Can you share which one this is?

doganayozese