filmov
tv
2017 EuroLLVM Developers’ Meeting: U. Weigand “LLVM performance optimization for z Systems”
![preview_player](https://i.ytimg.com/vi/Dub769wZDAk/maxresdefault.jpg)
Показать описание
—
LLVM performance optimization for z Systems - Ulrich Weigand, IBM
—
Since we initially added support for the IBM z Systems line of mainframe processors back in 2013, one of the main goals of ongoing LLVM back-end development work has been to improve the performance of generated code.
Now, we have for the first time reached parity with GCC: the latest benchmark results of LLVM 4.0 match those measured with current GCC.
In this talk I'll report on the most important changes we had to make to the back-end to achieve this goal. On the one hand, this includes changes to fully exploit all relevant instruction-set architecture features to make best possible use of z/Architecture instructions, e.g. including support for condition code values, the register high-word facility, and conditional execution.
On the other hand, I'll talk about some of the changes necessary to tune generated code for the micro-architecture of selected z Systems processors, in particular z13. This includes considerations like instruction scheduling, but also tuning loop unrolling, vectorization, and other instruction selection choices.
Finally, I'll show some opportunities for even further performance optimization, with focus on those where we are currently unable to fully exploit some hardware capabilities due to limitations in common-code parts of LLVM's code generator.
—
LLVM performance optimization for z Systems - Ulrich Weigand, IBM
—
Since we initially added support for the IBM z Systems line of mainframe processors back in 2013, one of the main goals of ongoing LLVM back-end development work has been to improve the performance of generated code.
Now, we have for the first time reached parity with GCC: the latest benchmark results of LLVM 4.0 match those measured with current GCC.
In this talk I'll report on the most important changes we had to make to the back-end to achieve this goal. On the one hand, this includes changes to fully exploit all relevant instruction-set architecture features to make best possible use of z/Architecture instructions, e.g. including support for condition code values, the register high-word facility, and conditional execution.
On the other hand, I'll talk about some of the changes necessary to tune generated code for the micro-architecture of selected z Systems processors, in particular z13. This includes considerations like instruction scheduling, but also tuning loop unrolling, vectorization, and other instruction selection choices.
Finally, I'll show some opportunities for even further performance optimization, with focus on those where we are currently unable to fully exploit some hardware capabilities due to limitations in common-code parts of LLVM's code generator.
—