A Deep Dive into Query Execution Engine of Spark SQL - Maryann Xue

Spark SQL enables Spark to perform efficient and fault-tolerant relational query processing with analytics database technologies. Relational queries are compiled into executable physical plans consisting of transformations and actions on RDDs, together with generated Java code. That code is compiled to Java bytecode, executed by the JVM, and optimized to native machine code by the JIT compiler at runtime. This talk takes a deep dive into the Spark SQL execution engine, covering pipelined execution, whole-stage code generation, UDF execution, memory management, vectorized readers, and lineage-based RDD transformations and actions.
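To make the idea behind pipelined execution and whole-stage code generation more concrete, here is a minimal sketch in plain Python. It is not Spark's generated code and does not use any Spark API. It only contrasts an iterator-per-operator pipeline with a single fused loop over the same toy query. All names (`scan`, `filter_op`, `project_op`, `run_pipelined`, `run_fused`, `ROWS`) are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]  # toy row type, an assumption for this sketch


# Iterator-style pipeline: each operator pulls rows from its child one at a time.
def scan(rows: Iterable[Row]) -> Iterator[Row]:
    for row in rows:
        yield row


def filter_op(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    for row in child:
        if predicate(row):
            yield row


def project_op(child: Iterator[Row], fn: Callable[[Row], int]) -> Iterator[int]:
    for row in child:
        yield fn(row)


def run_pipelined(rows: List[Row]) -> int:
    # Roughly: SELECT sum(age + 1) FROM rows WHERE age > 30
    plan = project_op(
        filter_op(scan(rows), lambda r: r["age"] > 30),
        lambda r: r["age"] + 1,
    )
    return sum(plan)


def run_fused(rows: List[Row]) -> int:
    # The same query collapsed into one loop, the shape that fusing
    # operators into a single generated function aims for: no per-operator
    # iterator hand-offs, and intermediate values stay in local variables.
    total = 0
    for row in rows:
        age = row["age"]
        if age > 30:
            total += age + 1
    return total


ROWS: List[Row] = [{"age": 25}, {"age": 35}, {"age": 45}]
```

Both functions compute the same result for the same input. The fused version simply avoids the per-row calls between operators, which is the kind of overhead that operator fusion is intended to remove.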

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.

Comments
Author

Excellent session, very well explained

αλήθεια-σκ
Author

What is the difference between a normal stage in a job and a WSCG (whole-stage code generation)? Do multiple pipelines within a single WSCG correspond to two separate stages?

megharaina