Koalas: Pandas API on Apache Spark - PyCon SG 2019
Speaker: Ben Sadeghi, Solutions Architect, Databricks
Pandas is the de facto standard (single-node) DataFrame implementation in Python, while Spark is the de facto standard for big data processing. If you are already familiar with pandas, the recently open-sourced Koalas package lets you be immediately productive with Spark, with no learning curve, and keep a single codebase that works both with pandas (tests, smaller datasets) and with Spark (distributed datasets). In this talk, we'll go through the basics of Koalas, along with demos.
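To make the "single codebase" idea concrete, here is a minimal sketch, not taken from the talk. It assumes the package is imported as `databricks.koalas` (the package name at the time) and that a local Spark installation is available for the Koalas path. The column names, the sample data, and the helper names `summarize` and `summarize_on_spark` are hypothetical.

```python
import pandas as pd


def summarize(df):
    """Mean score per team.

    Uses only DataFrame calls that pandas and Koalas both provide,
    so the same function accepts either kind of DataFrame.
    """
    return df.groupby("team")["score"].mean().sort_index()


def summarize_on_spark(pdf):
    """Run the same logic on Spark through Koalas (assumed to be installed)."""
    import databricks.koalas as ks  # imported lazily: needs Koalas and Spark

    kdf = ks.from_pandas(pdf)           # distribute the pandas DataFrame
    return summarize(kdf).to_pandas()   # collect the small result locally


# Hypothetical sample data, small enough for a unit test.
pdf = pd.DataFrame({"team": ["a", "a", "b"], "score": [1.0, 3.0, 5.0]})

# pandas path: runs on a single node, no Spark required.
pandas_result = summarize(pdf)
```

Calling `summarize_on_spark(pdf)` would be expected to give the same per-team means, computed by Spark instead of in-process. Keeping the shared logic in one function is what lets tests run against plain pandas while production data goes through Koalas.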
About the speaker:
Produced by Engineers.SG