Uncluster Your Data Science Using Vaex • Maarten Breddels & Jovan Veljanoski • GOTO 2021

This presentation was recorded at GOTO Copenhagen 2021. #GOTOcon #GOTOcph
ABSTRACT
Would you like to build a snappy dashboard that visualises hundreds of millions of data points, or interactively explore hundreds of gigabytes of data, all on a single machine?
Meet Vaex, an out-of-core DataFrame library for Python that can perform all the typical data manipulations, filtering, and aggregations on a billion rows in real time on a single computer. This approach lets your team focus on the business problem, because it removes the large DevOps overhead of configuring and maintaining a cluster.
Vaex fully supports Apache Arrow, which both facilitates interoperability with other systems and enables the storage and manipulation of more complex data structures such as lists [...]
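To make the out-of-core idea in the abstract concrete, here is a minimal sketch that uses only the Python standard library. It is not Vaex's implementation or API. It assumes a hypothetical on-disk layout of one raw float64 file per column, and the names `write_column` and `filtered_mean` are invented for illustration. The column is memory-mapped and viewed in place, so a filter plus an aggregation can run over it in chunks without first loading the whole file into memory.

```python
import mmap
import os
import tempfile
from array import array


def write_column(path, values):
    """Store one column as raw native-endian float64 values (hypothetical layout)."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def filtered_mean(path, threshold, chunk_rows=4):
    """Mean of all values greater than `threshold`, read through a memory map."""
    if os.path.getsize(path) == 0:
        return float("nan")  # an empty file cannot be memory-mapped
    total = 0.0
    count = 0
    # The cast view reinterprets the mapped bytes as doubles without copying;
    # the operating system pages data in only as it is touched.
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped, \
            memoryview(mapped) as raw, \
            raw.cast("d") as column:
        for start in range(0, len(column), chunk_rows):
            # Slicing a memoryview yields another view, not a copy.
            for value in column[start:start + chunk_rows]:
                if value > threshold:
                    total += value
                    count += 1
    return total / count if count else float("nan")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "x.f64")
        write_column(path, range(10))
        print(filtered_mean(path, threshold=4))  # 7.0
```

The pure-Python inner loop is slow and is there only to show the access pattern. A real DataFrame library would run the filter and the aggregation in compiled code over the same kind of mapped buffer.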
TIMECODES
00:00 Intro
00:50 Motivation
05:20 Vaex
06:14 Concepts: Memory mapping
07:56 Concepts: Column based storage
09:37 Concepts: No memory copies
10:50 Concepts: Compute & expression system
13:30 Demo
32:32 In production
34:23 In the wild
35:00 In production: Dash example
37:13 Summary
37:58 Outro
#Vaex #ApacheArrow #DataScience #AI #ML #ArtificialIntelligence #MachineLearning #DataFrame #Programming #VaexIO #Astronomy