Giles Weaver & Ian Ozsvald - Pandas 2, Dask or Polars? Tackling larger data on a single machine


Pandas 2 brings new Arrow data types, faster calculations and better scalability. Dask scales Pandas across cores. Polars is a new competitor to Pandas designed around Arrow with native multicore support. Which should you choose for modern research workflows? We'll solve a "just about fits in RAM" data task using all three tools, talking through the pros and cons so you can make the best choice for your research workflow. You'll leave with a clear idea of whether Pandas 2, Dask or Polars is the tool for your team to invest in.

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.

Comments

This is possibly the best (worst) way to make a technical talk exciting, literally burst into flames!

AadidevSooknananNXS

Why’d it end suddenly?? Very weird stuff.

JOHNSMITH-verq

Did the speaker intentionally use .query()?
It is one of the slowest methods in Pandas.

pietraderdetective
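
For readers unfamiliar with the alternative this comment implies, here is a minimal sketch comparing `.query()` with plain boolean-mask indexing. The DataFrame, its column names and its values are hypothetical and are not taken from the talk. The sketch only shows that the two forms select the same rows; it makes no claim about which is faster, since that likely depends on data size and on whether the optional numexpr package is installed.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "passengers": [1, 3, 4, 2, 5],
        "distance": [2.5, 8.0, 12.0, 1.0, 9.5],
    }
)

# Filter expressed as a string and evaluated by DataFrame.query().
via_query = df.query("passengers > 2 and distance < 10")

# The same filter written as an explicit boolean mask.
mask = (df["passengers"] > 2) & (df["distance"] < 10)
via_mask = df[mask]

# Both approaches should return identical rows.
print(via_query.equals(via_mask))
```

To check the speed claim on your own data, time both forms with `timeit` rather than relying on a general rule.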

Crappy talk. Limited information. Tons of extraneous nonsense. Nothing on dask

imtryinghere

While having a baby is a big thing in the parents' life, it has no relevance to PyData.

ringpolitiet