A deep dive into Flink SQL - Jark Wu, Kurt Young

Over the last two major versions (1.9 & 1.10), the Apache Flink community spent a great deal of effort improving the architecture toward unified batch and streaming processing. One example is that Flink SQL gained the ability to support multiple SQL planners under the same API. This talk first discusses the motivation behind these changes, and then takes a deep dive into Flink SQL. The presentation shows the unified architecture for handling streaming and batch queries and explains how Flink translates queries into relational expressions, leverages Apache Calcite to optimize them, and generates efficient runtime code for execution. The talk also describes the lifetime of a query in detail: how the optimizer improves the plan based on relational-node patterns, how Flink leverages a binary data format for its basic data structures, and how certain operators work. This gives the audience a better understanding of Flink SQL internals.
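The unified pipeline described above means the same query text goes through parsing, translation into Calcite relational expressions, optimization, and code generation regardless of whether the input is bounded or unbounded. A minimal sketch (the table and column names are hypothetical, for illustration only):

```sql
-- A continuous aggregation. Under the unified architecture this query is
-- planned and code-generated the same way whether `clicks` is a bounded
-- (batch) table or an unbounded (streaming) table; only the chosen
-- physical operators differ based on the execution mode and statistics.
SELECT user_id, COUNT(*) AS click_cnt
FROM clicks
GROUP BY user_id;
```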

Speakers: Jark Wu, Kurt Young from Alibaba
Comments

Thanks for the presentation. Are there any plans to share the slides?

ahmadawad

Hi, only after reading data from the source do we know whether there are 1,000 rows or 1 million rows, right? Only then can we decide whether to use a hash-based join or a broadcast-based join. So how is the broadcast hash join chosen in the physical plan?

cdinesh

Or was it using such efficient deserialization only for Tuples and not for Rows?

FlavioPompermaier

Is there an optimization here similar to whole stage code generation as in Spark?

ahmadawad

Flink will dominate real-time compute and machine learning!

qiwei