Managing Memory in PySpark: the new rlimit support -- or one less OOM

A quick look at managing memory in PySpark with the new rlimit support (in master, targeted for 2.4). The original plan for this went kind of sideways when I didn't realize I hadn't updated my SPARK_HOME on the YARN cluster, so it was using the wrong version of Spark, but we did manage to get it working on Kubernetes.
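
For reference, here is a minimal sketch (my own illustration, not taken from the video) of how the limit is typically enabled: in Spark 2.4+ the spark.executor.pyspark.memory setting caps the Python worker's memory via an rlimit, so a runaway UDF raises a MemoryError in Python instead of the YARN or Kubernetes container being OOM-killed. The app name below is just a placeholder.

    from pyspark.sql import SparkSession

    # Sketch, assuming Spark 2.4+: cap each executor's Python worker memory.
    # Spark applies the cap on the worker side with resource.setrlimit, so
    # exceeding it surfaces as a Python MemoryError rather than the cluster
    # manager killing the whole container.
    spark = (
        SparkSession.builder
        .appName("pyspark-rlimit-demo")  # hypothetical app name
        .config("spark.executor.pyspark.memory", "2g")
        .getOrCreate()
    )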
