Using Template Magic to Automatically Generate Hybrid CPU/GPU-Code - Elmar Westphal [ CppCon 2018 ]
In this talk you’ll learn how to write code that compiles either into a CPU-based loop or into a special kind of function called a “kernel” to be executed on a GPU. You’ll get an introduction to the memory and threading models of recent GPUs, along with examples of (mostly) simple helper templates to manage them. You can test and debug your code on the CPU and scale out later. In the end, you’ll be able to parallelise operations on vectors without having to think much about the architecture. Template magic will take care of that for you.
Note: there are several ways to leverage the compute power of GPUs for your applications. There are pragma-based approaches like OpenACC or recent versions of OpenMP. Or you can take more control and use approaches like Nvidia’s CUDA, AMD’s similar HIP, or the latest versions of OpenCL. All of the latter are based on subsets of the C++14 standard with extensions to manage the execution of code (at least) on GPUs. This session covers a CUDA C++ based approach, but the techniques shown should be applicable to other models as well.
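To make the idea concrete, here is a minimal sketch of what such a hybrid helper could look like. It is not the speaker's code: the names `HYBRID_FN`, `for_each_index`, `index_kernel`, `hybrid_alloc`, `hybrid_free`, `Saxpy`, and `saxpy_demo` are all hypothetical, and the block size of 256 and the use of managed memory are assumptions chosen for brevity. The same source compiles to a plain loop with an ordinary C++ compiler and to a kernel launch when built with `nvcc`, which defines `__CUDACC__`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef __CUDACC__
#include <cuda_runtime.h>
#define HYBRID_FN __host__ __device__  // callable from CPU and GPU code
#else
#define HYBRID_FN                      // plain C++ build: no decoration
#endif

// Hypothetical per-element operation: y[i] = a * x[i] + y[i].
struct Saxpy {
    float a;
    const float* x;
    float* y;
    HYBRID_FN void operator()(std::size_t i) const { y[i] = a * x[i] + y[i]; }
};

#ifdef __CUDACC__
// GPU build: each thread handles one index.
template <typename Op>
__global__ void index_kernel(std::size_t n, Op op) {
    std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
    if (i < n) op(i);  // guard the partially filled last block
}

template <typename Op>
void for_each_index(std::size_t n, Op op) {
    const unsigned block = 256;  // assumed block size, not tuned
    const unsigned grid = static_cast<unsigned>((n + block - 1) / block);
    index_kernel<<<grid, block>>>(n, op);
    cudaDeviceSynchronize();     // wait so the host can read the results
}
#else
// CPU build: the same call site becomes an ordinary loop.
template <typename Op>
void for_each_index(std::size_t n, Op op) {
    for (std::size_t i = 0; i < n; ++i) op(i);
}
#endif

// Allocation helpers: managed memory on the GPU build, new[] otherwise.
template <typename T>
T* hybrid_alloc(std::size_t n) {
#ifdef __CUDACC__
    T* p = nullptr;
    cudaMallocManaged(&p, n * sizeof(T));
    return p;
#else
    return new T[n];
#endif
}

template <typename T>
void hybrid_free(T* p) {
#ifdef __CUDACC__
    cudaFree(p);
#else
    delete[] p;
#endif
}

// Demo: x[i] = i, y[i] = 1, then y = a * x + y on whichever backend was built.
std::vector<float> saxpy_demo(std::size_t n, float a) {
    assert(n > 0);  // the sketch assumes a non-empty range
    float* x = hybrid_alloc<float>(n);
    float* y = hybrid_alloc<float>(n);
    for (std::size_t i = 0; i < n; ++i) { x[i] = static_cast<float>(i); y[i] = 1.0f; }
    for_each_index(n, Saxpy{a, x, y});
    std::vector<float> result(y, y + n);
    hybrid_free(x);
    hybrid_free(y);
    return result;
}
```

Calling `saxpy_demo(4, 2.0f)` should yield `{1, 3, 5, 7}` on either backend. The calling code never mentions threads or blocks; only the `for_each_index` template differs between builds, which is what allows debugging on the CPU first. Error checking of the CUDA calls is omitted here and would be needed in real code.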
—
Elmar Westphal, Forschungszentrum Juelich
Scientific Programmer
Elmar Westphal has been working as a programmer and cluster architect at Forschungszentrum Juelich for more than 15 years. Most recently, he has ported simulation programs from different fields of computational physics to single- and multi-GPU systems and developed CUDA-based building blocks, libraries, and applications, mostly for molecular dynamics and micromagnetism simulations.