Big Data Pipeline Design and Tuning in PySpark by Rockie Yang

PySpark is a great tool for building big data ETL pipelines. However, designing a pipeline that is easy to maintain, offers a holistic view, and makes bottlenecks simple to spot is difficult, to say nothing of enabling analytics on the ETL pipeline itself. Rockie Yang will share his experience building effective ETL pipelines with PySpark.

Audience level: Intermediate

Speaker: Rockie Yang