PySpark in Apache Spark 3.3 and Beyond

preview_player
Показать описание
PySpark has rapidly evolved with the momentum of Project Zen introduced in Apache Spark 3.0. We improved error messages, added type hints for autocompletion, implemented visualization, etc. Most importantly, Pandas API on Spark was introduced from Apache Spark 3.2 which exposes the pandas API that runs on Apache Spark, and the Pandas API on Spark has gained a lot of popularity.

In this talk, we will introduce what is new in PySpark at Apache Spark 3.3, and what is next beyond Apache Spark 3.3 with the current effort and roadmap in PySpark.

Connect with us:
Рекомендации по теме