CppCon 2019: Marshall Clow “std::midpoint? How Hard Could it Be?”

Показать описание

—
—
—
The standards committee adopted "P0811: Well-behaved interpolation for numbers and pointers" for C++20.
It includes a new library call `std::midpoint`.
The paper says "The simple problem of computing a value between two other values is surprisingly subtle in general."

In this talk, I will explore this simple call, provide a history of the development in libc++, and show some of the pitfalls.
Undefined behavior will rear its ugly head, along with numeric representations, and the arcane C promotion rules.

Along the way, we'll talk about testing, and why writing extensive tests helps everyone.
—
Marshall Clow
C++ Alliance
Engineer
Marshall has been programming professionally for 35 yearsHe is the author of Boost.Algorithm, and has been a contributor to Boost for more than 15 years. He is the chairman of the Library working group of the C++ standard committee. He is the lead developer for libc++, the C++ standard library for LLVM.
—

*-----*
*-----*

Рекомендации по теме

Комментарии

I think an important (and pretty obscure) detail is missing when working with floats, Icite from wikipedia:

"the internal registers in the 8087 were designed to hold intermediate results in an 80-bit "extended precision" format. The 8087 automatically converts numbers to this format when loading floating-point registers from memory and also converts results back to the more conventional formats when storing the registers back into memory."

So the function actually returns an 80bit floating point that will be truncate somewhere in the future. If you are taking the result and perform some other operations, the compilier might decide to keep operating on the 80bit register and depending on compiler optimization you will get different results!

It can get worse: the same function can be inlined or not, depending on what operations will be done with the result, you don't know when the result will be truncated.

You can either add a volatile (AARRGGHHH) keyword, disable the 80bit register (AARRGHHH) or just realize the truth: there is no spoon.

ponchietto

Guy spends an hour explaining how not to write a buggy midpoint implementation.
Comments flooded with buggy midpoint implementations.

jamesflames

If any of the std algorithms uses a midpoint of iterator, it would be good to provide that since the code exists anyway. The binary_search comes to mind. If that (for example) only likes random access iterators, then having the same limitations on the midpoint function would be fine.

JohnDlugosz

Just starting this talk, and I'm sure to love it as I have enjoyed Marshall Clow's other talks. But I'm slightly hung up on the comment at 8:43 that ints and floats are “not numbers” and not “not math” and I suspect is just a slip-of-the-tongue. Yes, they absolutely are numbers and math—they're just not the ones we'd sometimes like them to be. By pretending, as one might, that ints and floats model the fields of integers (𝐙, +, -, *, /) and reals (𝐑, +, -, *, /), we make a fatal mistake. The computer types violate key axioms of a field: ints and floats are bounded, so the operations will not close over the sets of values and this is "overflow". There are other issues (including associativity and the difference between a number and representation) but I don't want to digress from the point: these are still perfectly mathematical objects: they're just not the ones that fit the convenient axioms we are used to.

TimTeatro

This proves, once again, that the devil is really in the details... Thanks for the information...

SahinKupusoglu

Very nice and interesting talk, thank you.

ggabriel

Does this hold for non-nan floats: a < b => (midpoint(a, b) == a => (a == b) || (a.next() == b)) ?

movaxh

46:50 actually it could be done in single pass without counting the elements. Just move one iterator by one and another one by two in each iteration and when the other one reaches the end the first one is in the middle.

kubixus

for floats: abs(a) < lo, instead of abs(b) < lo wasn't a mistake? It is still in the version on github.

kirepudsje

what is the difference between the cmps at 29:40 ?

YumekuiNeru

I'm not convinced that the branch is necessarily worse than the cmov in this instance, and I suspect it depends strongly on the use case. If the branch predictor is able to speculate correctly a decent fraction of the time, the branchful version will be faster.

It kind of sucks -- this function is indeed an obviously good candidate for standardization, but there are lots of little decisions that actually matter and which have no clear answer.

isodoublet

For integers you could just use Don Knuth's midpoint algorithm: (a & b) + ((a ^ b) >> 1)

alexandernyberg

Since the midpoint of two T* can't be used as a T*, is there a use case for midpoint of pointers?

abuyoyo

never knew that denormalized numbers fail "<= hi" check

NoNameAtAll

still for integers(and if casted for any pointers too, but not sophisticated iterators) i would prefer simpler version:
//return (a>>1) + (b>>1) + (((a&1) + (b&1))>>1); //sorry before edit
return (a>>1) + (b>>1) + (a&b&1);

zdnuijbsodfteu

For integers, I bet a/2 + b/2 + (b & 1) works great.

Omnifarious

Haven't you guys found a more serious problem to talk about for one hour?

ArtyomPalvelev

CppCon 2019: Marshall Clow “std::midpoint? How Hard Could it Be?”

CppCon 2019: Marshall Clow “std::midpoint? How Hard Could it Be?”

C++Now 2019: Marshall Clow “The View from a Standard Library Implementor”

2019 EuroLLVM Developers’ Meeting: M. Clow “Changes to the C++ standard library for C++20”

Marshall Clow — Hardening the C++ standard template library

CppCon 2015: Marshall Clow “string_view'

When C++ Zero-Cost Abstraction fails: how-to Fix Your Compiler - Adrien Guinet - CppCon 2019

C++ STL numeric - std::midpoint -- Interview Question! | Modern Cpp Series Ep. 177

2018 EuroLLVM Developers’ Meeting: M. Clow “Hardening the Standard Library”

CppCon 2019: Chandler Carruth “There Are No Zero-cost Abstractions”

CppCon 2019: Jeff Garland “The C++20 Standard Library: Beyond Ranges”

Sorting Algorithms: Speed Is Found In The Minds of People - Andrei Alexandrescu - CppCon 2019

Non-conforming C++: the Secrets the Committee Is Hiding From You - Miro Knejp - CppCon 2019

J. Bialek, S. Block “Killing Uninitialized Memory: Protecting the OS Without Destroying Performance”...

Functional Programming without the Functions - Norm Birkett - CppCon 2019

unique_pseudofunction: N overloads for the price of 1 - Filipp Gelman - CppCon 2019

Maintainability and Refactoring Impact of Higher-Level Design Features - Titus Winters - CppCon 2019

How to Implement Your First Compiler Feature:The Story of Concepts in Clang - Saar Raz - CppCon 2019

A new tool for fighting against Hyrum’s law in ... concept-driven design - D. Hollman - CppCon 2019...

The Truth of a Procedure - Lisa Lippincott - CppCon 2019

Text - Corentin Jabot - CppCon 2019

Finding Strings in Large Files - Ondrej Dobias - CppCon 2019

v.~uint32_t(); - Matthis Kruse - CppCon 2019

C++ Class Natures in Their Canonical Form and How to Find Them - Peter Sommerlad - CppCon 2019

I need a volunteer who is not afraid of failure - Sy Brand - CppCon 2019