Sorting Algorithms: Speed Is Found In The Minds of People - Andrei Alexandrescu - CppCon 2019


Sorting Algorithms: Speed Is Found In The Minds of People

In all likelihood, sorting is one of the most researched classes of algorithms. It is a fundamental task in Computer Science, both on its own and as a step in other algorithms. Efficient algorithms for sorting and searching are now taught in core undergraduate classes. Are they at their best, or is there more blood to squeeze from that stone? This talk will explore a few less known – but more allegro! – variants of classic sorting algorithms. And as they say, the road matters more than the destination. Along the way, we'll encounter many wondrous surprises and we'll learn how to cope with the puzzling behavior of modern complex architectures.

Andrei Alexandrescu

Andrei Alexandrescu is a researcher, software engineer, and author. He wrote three best-selling books on programming (Modern C++ Design, C++ Coding Standards, and The D Programming Language) and numerous articles and papers on wide-ranging topics from programming to language design to Machine Learning to Natural Language Processing. Andrei holds a PhD in Computer Science from the University of Washington and a BSc in Electrical Engineering from University "Politehnica" Bucharest. He is the Vice President of the D Language Foundation.


Comments

Excellent talk. There are so many unsolved problems in sorting (especially adaptive sorts, parallel and distributed sorting, in-place merge-style sorting, etc.). People often overlook the actual complexity (both theoretical and practical) of sorting. Many years ago I had an interesting sorting requirement, and, well, I found out that despite 60 years of research and a lot of progress, to this day there is no known optimal solution, nor even a scheme that works well on average.

movaxh

Amazing talk with important conclusions and funny jokes from Andrei. Thanks a lot!

oleksiimomot

Andrei is one of the original standard-bearers of C++. Always enjoy his talks. He is creative and unique in his thinking. I wish CppCon would become smaller and raise its quality again, like in its golden age around 2014.

fritsgerms

Must-watch presentation, as always with A.A.

CedricMialaret

Best C++ speaker.
And I'm not even a C++ developer!

dengan

Fantastic talk. Andrei is such a great speaker, and the content was very interesting. I'm not sure how to really improve our code in the ways Andrei highlights, but it certainly seems to be something that we should be talking about.

andrewzmorris

42:20 That goto comment is one of the funniest things I have ever seen in programming!

bettkitty

_Design by Introspection_ is a subset of **Data-Oriented Design**.

There are 3 types of optimizations (from slowest to fastest):
1. Bit Twiddling
2. Algorithmic
3. Data-oriented (cache usage and branch prediction)

The FIRST rule for optimization is:

**KNOW THY DATA**

MichaelPohoreski

Selecting an algorithm through specifics about types is, I believe, compatible with (or part of) what Stepanov envisioned. The difference between a Concept and an Algebraic Object is that the operations of a concept come with a cost.

greenfloatingtoad

25:50 "I am almost sorry there is no rotation there"

LOL

JordiMontesSanabria

That was fantastic! Brilliant. Add that with Carbon and we're rocking....

chidude

I think it may be interesting to try using a different intermediate structure than the heap. If I understood it correctly, the key reason why constructing a heap improves the performance is that a heap is in some sense already partially sorted, and so the linear searches end up being shorter or more uniform, improving both locality and maybe even branch prediction.

I think there may exist some data structure that would be simpler to construct than the heap (maybe in terms of big-O or maybe in terms of performance) but would have the same or similar effects for the subsequent linear insertion sort.

After all, as Alexandrescu himself said, "the heap is a delicate, fine-tuned structure", so maybe constructing a heap is actually overkill. After all, we are not actually using all the invariants the heap provides. So maybe we can get away with some structure with weaker invariants.
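A minimal sketch of the combination being discussed here, assuming the make_heap-then-unguarded-insertion-sort structure described in the talk (identifiers and details are illustrative, not Andrei's actual code):

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Build a min-heap first, then run an insertion sort whose inner loop needs
// no bounds check: the heap guarantees the smallest element sits at the
// front, where it acts as a sentinel.
template <typename It>
void heap_then_unguarded_insertion_sort(It begin, It end) {
    if (begin == end) return;
    // Min-heap: after this call, *begin holds the minimum of the range.
    std::make_heap(begin, end, std::greater<>{});
    for (It i = begin + 1; i != end; ++i) {
        auto value = std::move(*i);
        It j = i;
        // Unguarded inner loop: no `j != begin` test, because `value` can
        // never be smaller than the minimum stored at *begin.
        while (value < *(j - 1)) {
            *j = std::move(*(j - 1));
            --j;
        }
        *j = std::move(value);
    }
}

int main() {
    std::vector<int> v = {5, 1, 4, 1, 5, 9, 2, 6};
    heap_then_unguarded_insertion_sort(v.begin(), v.end());
    // v is now 1 1 2 4 5 5 6 9
}
```

The open question raised above is whether the full heap invariant is needed at all, or whether a cheaper-to-build structure that only guarantees "minimum at the front, roughly ordered elsewhere" would give the same locality and branch-prediction benefits.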

ruroruro

I have Andrei's book on the D language -- an excellent and insightful work.

evikone

The guy from Somebody Feed Phil moonlights as a C++ guru. Amazing talk.

srh

I have a problem with this talk: it presumes a lot.

It presumes branch prediction, caches of a certain depth, speculative execution, and who knows what else. Sure, modern desktop CPUs have all of that, as do the large ARMs. Now take this code and start moving it backwards in history. Say, to a Raspberry Pi A. To a P-II. To a 486. To a 386. To an AVR. Heck, rewrite it in Turbo Pascal for CP/M. I've tried: it all falls apart once you go back past ARMv7. And the simpler the architecture, the worse the regression.

The standard library is supposed to be as universal as possible. It may run on varied architectures. Yet here we micro-optimize in ways that make things much worse as soon as you're outside the realm of the most modern CPUs. These improvements make the sort more than 2x slower on a 486, for example. Sure, that's not exactly a common target. But Arduino supports C++20, and if we try this trick there, we get the same performance regressions. So, before anyone sticks this code into a standard library, they had better examine the supported platforms and make it optional, enabled only on modern architectures.

I also have a problem with sweeping statements like "they don't teach it in books". Introductory CS texts all presume an uncached, in-order architecture where every memory access and comparison has equal cost. We all know this, even if the authors fail miserably at putting that assumption in big bold letters at the start of every chapter. But the standard analysis of algorithms doesn't somehow fail here: as soon as the cost model accounts for caching and branch prediction, you get somewhat more complex equations, but the computed costs once again approximate the costs of running on real hardware. So, given that we all understand what assumptions the classic CS texts make, it's disingenuous to say "they don't teach it in the books". They don't because, even with the simplified logical architecture the texts presume, the subject is hard enough. And there are plenty of theoretical treatments of algorithms that do in fact take architectural details into account.

Also, it's not as if the CS classics are completely detached from modernity. There's this thing that was all the rage back when core memory was too small to fit all the business data. Remember tape sorting? It takes excellent advantage of memory caching and prefetch, and you start seeing those advantages as soon as you "upgrade" from a 386 to a 486.

Also, this general admonition against branches and in favor of data-integrated condition tests: it's all fine and dandy if you don't introduce data dependencies. That was not a thing back on a 486, but these days short-acting data dependencies deparallelize the CPU, and the worse they get, the more the billion-transistor CPU starts to work like a fast but "dumb" 486 with 1000x fewer transistors…

This talk is a wonderful show and it is full of technical insights, but this is high-grade engineering entertainment, not an engineering project report. Don’t try to apply these things without having a clear understanding of what’s involved but unsaid, and for crying out loud: repeat all the benchmarks in your target application, with your target data.
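A toy illustration of the branch-versus-data-dependency trade-off mentioned above (my own example, not code from the talk): the branchless form removes the unpredictable branch, but the condition now sits on the accumulator's data-dependency chain instead of being speculated past.

```cpp
#include <cstddef>

// Branchy version: cheap on a simple in-order core, but on a modern CPU an
// unpredictable comparison costs a pipeline flush on every misprediction.
std::size_t count_below_branchy(const int* a, std::size_t n, int t) {
    std::size_t c = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] < t) ++c;
    return c;
}

// "Data-integrated" version: the comparison result is folded into the
// arithmetic, so there is no branch to mispredict, but the result of every
// comparison now feeds directly into the serial chain of additions.
std::size_t count_below_branchless(const int* a, std::size_t n, int t) {
    std::size_t c = 0;
    for (std::size_t i = 0; i < n; ++i)
        c += static_cast<std::size_t>(a[i] < t);
    return c;
}
```

Which version wins depends on how predictable the data is and on the target core, which is exactly why re-running the benchmarks on the target platform with the target data matters.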

absurdengineering

How on earth do you write unguarded_insertion_sort for a generic type? It's one thing for std::sort to return elements in an arbitrary order if a programmer overlooks that their less-than predicate doesn't define a total order. It's quite another for the insertion operation to underflow its buffer and start writing to arbitrary locations.

Actually, I don't even know that you can do it for doubles. At least, his code has a bug: at 29:22, if the element at *begin is NaN, then make_heap will leave it there, because greater<>() is false for all comparisons with that element. Then unguarded_insertion_sort will underflow below that element, because all less<>() comparisons will be false as well. IEEE 754 floating point doesn't define a total order through its comparison operators, so you can't use the default comparisons the way Andrei does.
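A small demonstration of the underlying issue (not the talk's code): every ordering comparison involving a NaN is false, so the "smallest element at the front" sentinel guarantee evaporates.

```cpp
#include <functional>
#include <iostream>
#include <limits>

// Built-in floating-point comparisons are not a total order once NaN is
// involved, so any proof that relies on a strict weak ordering (such as
// "the minimum is at *begin and acts as a sentinel") no longer holds.
int main() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double x = 1.0;

    std::cout << std::boolalpha
              << std::less<>{}(nan, x) << '\n'     // false
              << std::less<>{}(x, nan) << '\n'     // false
              << std::greater<>{}(nan, x) << '\n'  // false
              << std::greater<>{}(x, nan) << '\n'; // false
    // Because all four are false, make_heap can leave a NaN at the front,
    // where it is not a valid "smallest element" sentinel. One way out is a
    // NaN-aware comparator, or a total order such as C++20's std::strong_order.
}
```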

OMGclueless

Thank you sir, this is the best talk for learning from an engineer 👍👍

sakuraikamoto

22:50 That comment on the undergrads in India was hilarious. Quite true, but still... hahahaha

Prashanth

Slide at 26:56 is wrong.
It should be +1, end), because the 2nd smallest element can be at either begin + 1 or begin + 2 and it would still be a valid heap.
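A tiny example of the point being made (my own illustration, not the slide's code): both layouts below satisfy the min-heap property, yet the second smallest element sits at index 1 in one and at index 2 in the other.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// In a min-heap, only the root is guaranteed to be in its final sorted
// position; the second smallest element may be either child of the root.
int main() {
    std::vector<int> a = {1, 3, 2};  // 2nd smallest (2) is at index 2
    std::vector<int> b = {1, 2, 3};  // 2nd smallest (2) is at index 1

    // Both layouts satisfy the min-heap property (parent <= children).
    assert(std::is_heap(a.begin(), a.end(), std::greater<>{}));
    assert(std::is_heap(b.begin(), b.end(), std::greater<>{}));
}
```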

stephenhowe

36:12 I went the other way: I thought you'd add an arbitrarily large element (larger than any element in the list) to the list before the heapify, and then pop it after the sort.
39:45 Infinite loops FTW

joseville