Reactive Summit 2020: Ajit Koti, Tale of Stateful Stream to Stream Processing

Streaming engines like Apache Flink are redefining how we process data. Flink lets you extract, transform, and write data with an ease that matches batch processing frameworks. There are plenty of known, proven examples of converting a single batch job into a streaming job. Converting a stateful, end-to-end batch workflow into multiple stateful stream jobs, however, raises far more challenges.

Netflix processes payments for 180M+ members across 190 countries. Payment processing and transaction data are critical for measuring the operational health and performance of our payments platform. We decided to move the existing batch workflow entirely to streaming, and things got exciting when we wanted to introduce multiple streaming jobs with zero data loss and high accuracy. In this talk, we describe how we converted a conventional, complex stateful batch workflow into a multi-step stateful streaming workflow at Netflix using Flink. You’ll learn about:

1) Design and architecture involving multiple stateful streaming jobs (see the Flink sketch below)
2) Managing schema evolution using Avro for stateful real-time applications (see the Avro sketch below)
3) Sharing code between Flink and Spark for any fallback batch processing
4) Handling the cascading impact of events that arrive out of order (see the Flink sketch below)
5) Landing processed data in real time into multiple sinks such as Iceberg and Druid
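As a rough illustration of points 1 and 4, here is a minimal Flink DataStream sketch, not the actual Netflix job: a keyed process function keeps a per-member running total in ValueState, and a bounded-out-of-orderness watermark strategy tolerates events arriving late. The PaymentEvent type, field names, and the five-second lateness bound are all hypothetical.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

import java.time.Duration;

public class StatefulPaymentJob {

    // Hypothetical event type, used only for this sketch.
    public static class PaymentEvent {
        public String memberId;
        public double amount;
        public long eventTimeMs;

        public PaymentEvent() {}

        public PaymentEvent(String memberId, double amount, long eventTimeMs) {
            this.memberId = memberId;
            this.amount = amount;
            this.eventTimeMs = eventTimeMs;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(
                new PaymentEvent("m-1", 9.99, 1_000L),
                new PaymentEvent("m-1", 4.99, 500L)) // arrives out of order
            // Bounded-out-of-orderness watermarks tolerate events up to 5s late.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy.<PaymentEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((e, ts) -> e.eventTimeMs))
            .keyBy(e -> e.memberId)
            .process(new RunningTotal())
            .print();

        env.execute("stateful-payment-totals");
    }

    // Keyed state keeps a per-member running total across events.
    static class RunningTotal extends KeyedProcessFunction<String, PaymentEvent, String> {
        private transient ValueState<Double> total;

        @Override
        public void open(Configuration parameters) {
            total = getRuntimeContext().getState(
                new ValueStateDescriptor<>("total", Double.class));
        }

        @Override
        public void processElement(PaymentEvent e, Context ctx, Collector<String> out)
                throws Exception {
            Double current = total.value();
            double updated = (current == null ? 0.0 : current) + e.amount;
            total.update(updated);
            out.collect(e.memberId + " -> " + updated);
        }
    }
}
```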
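For point 2, a small sketch of the Avro schema-resolution mechanism that makes stateful schema evolution workable: a record serialized with an old writer schema is decoded against a newer reader schema that adds a field with a default value. The Payment schema and its fields are made up for illustration and are not from the talk.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;

public class AvroEvolutionDemo {

    // Writer schema: the version the upstream job serialized with.
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Payment\",\"fields\":[" +
        "{\"name\":\"memberId\",\"type\":\"string\"}," +
        "{\"name\":\"amount\",\"type\":\"double\"}]}");

    // Reader schema: adds a field with a default, so old records still decode.
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Payment\",\"fields\":[" +
        "{\"name\":\"memberId\",\"type\":\"string\"}," +
        "{\"name\":\"amount\",\"type\":\"double\"}," +
        "{\"name\":\"currency\",\"type\":\"string\",\"default\":\"USD\"}]}");

    public static void main(String[] args) throws Exception {
        // Serialize a record with the old (writer) schema.
        GenericRecord old = new GenericData.Record(WRITER);
        old.put("memberId", "m-123");
        old.put("amount", 9.99);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(old, enc);
        enc.flush();

        // Deserialize with the new (reader) schema; the added field
        // resolves to its declared default.
        BinaryDecoder dec = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord evolved =
            new GenericDatumReader<GenericRecord>(WRITER, READER).read(null, dec);
        System.out.println(evolved); // currency resolves to "USD"
    }
}
```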