Building data pipelines with Java and open source

Speaker: Rustam Mehmandarov
Recorded: 2020-10-06

A few years ago, moving data between applications and data stores meant expensive, monolithic stacks from large software vendors with little flexibility. Now, with frameworks such as Apache Beam and Apache Airflow, we can schedule and run data processing jobs for both streaming and batch workloads with the same underlying code. This presentation demonstrates how these frameworks can glue your applications together, shows how to run data pipelines as Java code, covers the use cases for such pipelines, and explains how to move them from local machines to cloud solutions by changing just a few lines of Java in our Apache Beam code.
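To make that claim concrete, here is a minimal sketch of a Beam pipeline in Java, assuming the Beam Java SDK (org.apache.beam:beam-sdks-java-core) and a runner dependency are on the classpath. The class name, file paths, and transform labels are illustrative placeholders, not code from the talk:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class LineCountPipeline {
  public static void main(String[] args) {
    // The runner and its settings come from the command line, not from the code.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    p.apply("ReadLines", TextIO.read().from("input.txt"))   // read a text file line by line
     .apply("CountLines", Count.<String>globally())         // count all elements
     .apply("FormatResult", MapElements.into(TypeDescriptors.strings())
                                       .via((Long n) -> "lines: " + n))
     .apply("WriteResult", TextIO.write().to("line-count")); // writes sharded output files

    p.run().waitUntilFinish();
  }
}
```

Run with no arguments and this executes locally on Beam's default DirectRunner. To move it to a managed service such as Google Cloud Dataflow, the pipeline logic stays the same: you add the Dataflow runner dependency, point the input and output paths at cloud storage, and pass flags such as --runner=DataflowRunner --project=<your-project> --region=<region> --tempLocation=gs://<bucket>/tmp. That is the kind of small change the abstract refers to.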

Rustam Mehmandarov lives and works in Oslo, Norway, where he is a chief engineer and consultant specializing in the Java platform, as well as a competency network coordinator at his company.

In his spare time, he contributes to several local developer communities. Rustam is passionate about open source and about sharing his knowledge with others. He is a Google Developer Expert (GDE) for Cloud and a Java Champion. Since 2017 he has been an organiser of GDG Cloud Oslo, Norway. Previously, he led JavaZone and the Norwegian JUG, javaBin.

He is a frequent speaker at both national and international conferences and events. You can find Rustam on Twitter as @rmehmandarov.

Organized by: Java User Group Switzerland
Comments

Hello, thanks a lot for this video. Please, could you share a GitHub repository for this project?

mustapharaimilawal

This made building data pipelines seem approachable. Thank you.

wesNeill

How did you overcome the API's switch to pretty-printed JSON? Did you have to transform it to single lines somehow? I'm working on an application that needs to take standard JSON.

wesNeill
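The question above goes unanswered on the page. One common workaround, sketched here on the assumption that Jackson (com.fasterxml.jackson.core:jackson-databind) is available, is to parse each record and re-serialize it compactly, since ObjectMapper writes JSON without newlines by default; a line-oriented reader such as TextIO can then consume one record per line. The class and method names below are hypothetical, not the speaker's answer:

```java
import java.io.IOException;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonFlattener {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  // Hypothetical helper: parses a (possibly pretty-printed) JSON document and
  // re-serializes it compactly so the whole record fits on a single line.
  public static String toSingleLine(String prettyJson) throws IOException {
    JsonNode node = MAPPER.readTree(prettyJson);
    return MAPPER.writeValueAsString(node); // compact output: no indentation, no newlines
  }
}
```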