CppCon 2018: Jefferson Amstutz “Compute More in Less Time Using C++ SIMD Wrapper Libraries”

Leveraging SIMD (Single Instruction, Multiple Data) instructions is an important part of fully utilizing modern processors. However, using SIMD hardware features from C++ can be difficult, as it requires an understanding of how the underlying instructions work. Furthermore, there is not yet a standardized way to express C++ code that guarantees such instructions are used effectively to increase performance.

Lastly, this talk will also seek to unify the greater topic of data parallelism in C++ by connecting the SIMD parallelism concepts demonstrated to other expressions of parallelism, such as SPMD/SIMT parallelism used in GPU computing.
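
For a flavor of what such wrapper libraries let you write, here is a minimal sketch (not from the talk) using std::experimental::simd from the Parallelism TS v2, one of the wrapper approaches in this space; GCC ships it as <experimental/simd>:

#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;

// Multiply two float arrays element-wise, one full SIMD register at a time.
// For brevity, assumes n is a multiple of the hardware SIMD width.
void mul(const float* a, const float* b, float* out, std::size_t n) {
    using floatv = stdx::native_simd<float>;  // e.g. 8 lanes with AVX
    for (std::size_t i = 0; i < n; i += floatv::size()) {
        floatv va(a + i, stdx::element_aligned);  // vector load
        floatv vb(b + i, stdx::element_aligned);
        (va * vb).copy_to(out + i, stdx::element_aligned);  // multiply and store
    }
}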

Jefferson Amstutz, Software Engineer
Intel
Jeff is a Visualization Software Engineer at Intel, where he leads the open source OSPRay project. He enjoys all things ray tracing, high performance computing, clearly implemented code, and the perfect combination of git, CMake, and modern C++.


Comments

As for GPU kernels vs CPU kernels, the difference is in the relative cost of memory operations compared to register calculation speed, as well as the size of the register file. GPUs tend to have an order of magnitude faster calculation while memory is on par or slower, due to relatively smaller per-thread caches - so you have to be even more sparing with memory bandwidth.
Also, GPUs prefer bigger block operations than CPUs due to memory/cache architecture. That's about it.

AstralSorm

28:50 From what I understand, trig functions are available only with AVX-512, which exists only on a few Xeons and, so far, very few consumer-grade CPUs?

GeorgeTsiros

You don't have to modify your code at all with "vertical" vectorization... Just apply SIMD to all operations and enjoy the free speed upgrade...
Meanwhile, with horizontal vectorization you have to rewrite your code completely for ray tracing: handle pointers to materials, reduction of the closest hit, and above all recursion, where the paths and steps of each vectorized ray are very different.

panjak
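
To illustrate panjak's distinction, a sketch (mine, not from the talk, using std::experimental::simd; terminology as in the comment above): vertical vectorization just widens each arithmetic operation, while horizontal packing puts one ray per lane and forces control flow to be rewritten as masks.

#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

// "Vertical": the scalar algorithm is unchanged; each operation simply
// processes a full register of consecutive elements (tail handling omitted).
void scale(const float* in, float* out, std::size_t n, float s) {
    for (std::size_t i = 0; i < n; i += floatv::size())
        (floatv(in + i, stdx::element_aligned) * s)
            .copy_to(out + i, stdx::element_aligned);
}

// "Horizontal" (ray packets): one ray per lane. Branches become masked
// operations because each lane may take a different path.
struct RayPacket { floatv oz, dz, tHit; };  // z components only, for brevity

void intersectPlane(RayPacket& r, float planeZ) {
    floatv t = (planeZ - r.oz) / r.dz;  // per-lane hit distance
    auto hit = t > 0.f && t < r.tHit;   // mask of lanes hitting closer
    stdx::where(hit, r.tHit) = t;       // update only those lanes
}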

Very good information, thank you.

The examples could be a bit more realistic. Neural networks use the fundamental linear transformation Ax + b (A is a matrix, x and b are vectors); 3D graphics uses vectors of {x, y, z, w} (w is needed for transforms and perspective projection) along with 4x4 transform matrix multiplications.

maxxba
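
Picking up maxxba's suggestion, here is a sketch of the {x, y, z, w} case with a 4-wide fixed-size SIMD type (my example, not from the talk; column-major storage assumed):

#include <experimental/simd>
namespace stdx = std::experimental;
using vec4 = stdx::fixed_size_simd<float, 4>;  // exactly 4 lanes

struct mat4 { vec4 col[4]; };  // column-major 4x4 matrix

// M * v: broadcast each component of v across the matching column, then sum.
vec4 transform(const mat4& m, const vec4& v) {
    return m.col[0] * v[0] + m.col[1] * v[1]
         + m.col[2] * v[2] + m.col[3] * v[3];
}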

20:48 Sorry if this is noob talk, but wouldn't the .insert()/.extract() methods defeat the purpose by adding a function call just to get the element? Not trying to hate, but I don't get it.

Theandrey
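
For context (my note, not a reply from the speaker): accessors like these in SIMD wrappers are tiny inline functions, so under optimization the call itself disappears and only the actual lane access remains. A sketch with a hypothetical minimal wrapper:

// Hypothetical 8-wide wrapper: extract()/insert() are trivial inline
// member functions, so at -O2 they compile down to a single lane access
// (e.g. an extract/blend instruction), not an out-of-line call.
struct float8 {
    float v[8];
    float extract(int i) const { return v[i]; }
    void insert(int i, float x) { v[i] = x; }
};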

Why don't we have an "AsmCon"? That could teach a few lessons to all the modern C++ hipsters.

llothar

“I can code faster in assembly” is the equivalent flex of “I can shift faster than your automatic”

tc

In the examples there isn't any handling of tails (the leftover elements when the data size isn't a multiple of the SIMD width).

ilnurKh
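
True; the usual pattern (my sketch, not from the talk) is a full-width main loop plus a scalar epilogue for the tail:

#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

// y[i] += x[i], handling n that is not a multiple of the SIMD width.
void add(const float* x, float* y, std::size_t n) {
    std::size_t i = 0;
    for (; i + floatv::size() <= n; i += floatv::size()) {
        floatv vy(y + i, stdx::element_aligned);
        vy += floatv(x + i, stdx::element_aligned);
        vy.copy_to(y + i, stdx::element_aligned);
    }
    for (; i < n; ++i)  // scalar epilogue for the remaining elements
        y[i] += x[i];
}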

10:46 AMD GPUs do exactly 64 floats side by side though, right? Well, it's a bit iffy if you'd call that a SIMD register anyway.

msqrt

What would be the benefit of using the library over just letting the compiler autovectorize the code? Modern compilers already do a pretty good job at that.

decayl
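
One common answer (mine, not from the talk): autovectorizers are fragile around data-dependent branches and give no guarantee, while a wrapper makes the vector code explicit. A sketch of a branchy loop written with an explicit mask:

#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

// Clamp negative values to zero. As a scalar loop with an `if`, this may or
// may not autovectorize depending on compiler heuristics; with an explicit
// mask the blend happens by construction.
void clampNegatives(float* x, std::size_t n) {
    std::size_t i = 0;
    for (; i + floatv::size() <= n; i += floatv::size()) {
        floatv v(x + i, stdx::element_aligned);
        stdx::where(v < 0.f, v) = 0.f;  // masked assignment instead of a branch
        v.copy_to(x + i, stdx::element_aligned);
    }
    for (; i < n; ++i)  // scalar tail
        if (x[i] < 0.f) x[i] = 0.f;
}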

36:46 "Saxpy is nonsense as well" - pardon me, but SAXPY is at the core of most artificial neural networks: input*weight + bias. Just sayin'.

totalermist
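
For reference, SAXPY is the BLAS routine y = a*x + y; a sketch of it with a SIMD wrapper (mine, again assuming std::experimental::simd):

#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

// SAXPY: y = a*x + y ("single-precision a times x plus y").
void saxpy(float a, const float* x, float* y, std::size_t n) {
    std::size_t i = 0;
    for (; i + floatv::size() <= n; i += floatv::size()) {
        floatv vy(y + i, stdx::element_aligned);
        vy = a * floatv(x + i, stdx::element_aligned) + vy;
        vy.copy_to(y + i, stdx::element_aligned);
    }
    for (; i < n; ++i)  // scalar tail
        y[i] = a * x[i] + y[i];
}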