Building Data Pipelines with Open Source by Rustam Mehmandarov
@CJUG February 16, 2022
ABSTRACT:
A few years ago, moving data between applications and datastores meant expensive, inflexible monolithic stacks from large software vendors. Now, with frameworks such as Apache Beam and Apache Airflow, we can schedule and run data processing jobs for both streaming and batch with the same underlying code. This presentation demonstrates how these tools can glue your applications together, and shows how a data pipeline can run locally with Apache Beam and then move to Dataflow and BigQuery by changing only a few lines of Java in the Apache Beam code. We will also look at how this can be deployed across different cloud solutions.
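To illustrate the "same code, different runner" idea from the abstract, below is a minimal Beam word-count sketch in Java. This is not the speaker's demo code; the file paths and option values are hypothetical. The point is that the pipeline logic stays the same and only the runner selection (and a handful of Dataflow options) changes between a local run and a cloud run.

```java
import java.util.Arrays;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.TypeDescriptors;

public class WordCountPipeline {
  public static void main(String[] args) {
    // The runner is chosen from the command line, so the same pipeline code
    // runs locally or on Google Cloud Dataflow. Example flags (hypothetical values):
    //   --runner=DirectRunner
    //   --runner=DataflowRunner --project=my-project --region=europe-north1 \
    //     --tempLocation=gs://my-bucket/temp
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline pipeline = Pipeline.create(options);

    pipeline
        // Read text lines from an input file (hypothetical path).
        .apply("ReadLines", TextIO.read().from("input.txt"))
        // Split each line into words.
        .apply("SplitWords", FlatMapElements
            .into(TypeDescriptors.strings())
            .via((String line) -> Arrays.asList(line.split("[^\\p{L}]+"))))
        // Count occurrences of each word.
        .apply("CountWords", Count.perElement())
        // Format each (word, count) pair as a line of text.
        .apply("FormatResults", MapElements
            .into(TypeDescriptors.strings())
            .via((KV<String, Long> wordCount) ->
                wordCount.getKey() + ": " + wordCount.getValue()))
        // Write the results (hypothetical output prefix).
        .apply("WriteCounts", TextIO.write().to("word-counts"));

    pipeline.run().waitUntilFinish();
  }
}
```

Swapping the sink to BigQuery would follow the same pattern: replace the final write with Beam's BigQueryIO transform and supply the table and schema via options, again without touching the core pipeline logic.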
🗣SPEAKER BIO🗣
Rustam Mehmandarov
Passionate computer scientist. Java Champion and Google Developers Expert for Cloud. Public speaker. Ex-leader of JavaZone and Norwegian JUG – javaBin.