Koalas: pandas APIs on Apache Spark

# Abstract
In this talk, Reynold will present Koalas, a new open source project that was announced at the Spark + AI Summit in April. Koalas is a Python package that implements the pandas API on top of Apache Spark, to make the pandas API scalable to big data. Using Koalas, data scientists can make the transition from a single machine to a distributed environment without needing to learn a new framework.

Reynold will demonstrate the functionality added to Koalas since its initial release, discuss its roadmap, and explain how he envisions Koalas becoming the standard API for large-scale data science.

# Speaker Bio
Reynold Xin is a cofounder and Chief Architect at Databricks. In the open source community, Reynold is known as a top contributor to the Apache Spark project, having designed many of its core user-facing APIs and execution engine features. Reynold received a PhD in Computer Science from UC Berkeley, where he worked on large-scale data processing systems.
Comments
Author

I am using my special gift of imagination to see his code in my mind.

TheBjjninja
Author

You realize you are not sharing your screen, right?

mjmurphy