Design Principles of Distributed Systems with Dask and PySpark

On this week's Science Thursday, Holden Karau joins Matt Rocklin and Hugo Bowne-Anderson to discuss the design of Dask, how it compares to PySpark, and the reasoning behind the design tradeoffs involved.
00:00 Going live!
01:35 Introducing Holden
03:15 What is Dask?
06:10 Princess of the Covariance Matrix
08:41 Holden's introduction to Dask
10:50 Why learn Dask in public?
12:50 Difficulties Holden faced when first learning Dask
16:37 Dask and supporting packages
19:50 Architectural differences between Spark and Dask and the impact
28:10 Local modes of Dask
32:30 How does Spark manage software environments for distributed clusters?
34:40 Dask adaptive scaling
44:45 Dask and memory
48:40 Resource management in Dask
55:55 Wrapping up