G-Research Distinguished Speaker Series: Apache Arrow - High Performance Columnar Data Framework
In the latest edition of the G-Research Distinguished Speaker Series, Wes McKinney, co-creator of Apache Arrow and creator of Python pandas, discusses the latest developments in Apache Arrow - a multi-language toolbox for accelerated data interchange and in-memory processing.
In this talk, Wes McKinney covers the following topics:
Compute and Data Silos
Trends in Hardware
Apache Arrow and where it’s up to
Defragmentation Interoperability
Standard In-Memory Format Goals
Apache Arrow Data Types
Apache Arrow Streaming Binary Protocol
Zero-copy data interchange
High Performance Bridge to Storage
Arrow Flight and fast data sharing with Arrow Flight
Parallel Data Access with Arrow Flight
Arrow Flight SQL
Query Engines for Arrow
Near future: Modular Arrow Computing
Substrait: Serialised Relational Algebra
Portable Query Plans / Substrait in perspective
Analytics database Architecture
Analytics database, deconstructed
Apache Arrow 7.0.0
Coming soon with Apache Arrow
Engine Interfaces in Python