GPU-accelerating UDFs in PySpark with Numba and PyGDF

AnacondaCon 2018. Keith Kraus & Joshua Patterson. With hardware advances such as 10-gigabit network cards, InfiniBand, and solid-state drives all becoming commodity offerings, the bottleneck in big data systems is now very commonly the processing power of the CPU. To meet the computational demand of their users, enterprises have had to resort to extreme scale-out approaches just to get the processing power they need. Apache Spark, one of the best-known technologies in this space, has numerous enterprises publicly discussing the challenges of running multiple 1000+ node clusters to give their users the processing power they need. This talk is based on work completed by NVIDIA's Applied Solutions Engineering team. Attendees will learn how they were able to GPU-accelerate UDFs in PySpark using open source technologies such as Numba and PyGDF, the le
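To illustrate the idea the talk describes, here is a minimal sketch of the pattern that makes UDFs amenable to Numba and PyGDF: a per-row Python UDF rewritten as a vectorized kernel over whole columns. The vectorized form is the shape of code that Numba can JIT-compile (and, with `numba.cuda.jit`, run on the GPU, with PyGDF, now cuDF, keeping the columns in device memory). The haversine example and function names below are illustrative, not taken from the talk; NumPy stands in for the GPU arrays.

```python
import math

import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_row(lat1, lon1, lat2, lon2):
    """Per-row UDF: great-circle distance in km between two points.

    This is the style a naive PySpark UDF would use: one interpreted
    Python call per record, which is the performance bottleneck.
    """
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def haversine_cols(lat1, lon1, lat2, lon2):
    """Same UDF vectorized over whole columns (NumPy arrays).

    One call processes every row at once; this form can be compiled
    with numba.jit, or turned into a GPU kernel with numba.cuda.jit
    over columns held in GPU memory by PyGDF/cuDF.
    """
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = np.radians(lat2 - lat1)
    dlmb = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
```

The columnar version does the same arithmetic, but the loop over rows moves out of the Python interpreter and into compiled code, which is what makes GPU execution worthwhile.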
Comments

A great presentation! Is there somewhere we can find code examples to test? Thanks in advance.

abdallahaguerzame

Well presented. I'm wondering if any examples or clues could be provided for running Java/Scala-based UDF implementations on GPUs via PyGDF? I'm assuming a lot of people have already converted their UDF implementations to Java/Scala to gain performance on CPUs.

FahadSheikh