Universal Machine Learning with Apache Beam - Maximilian Michels & Robert Bradshaw

Flink Forward Berlin, September 2018 #flinkforward

Apache Beam is a unified programming model for batch and streaming data processing. It runs on various execution backends, such as Apache Flink, Apache Spark, Apache Samza, Apache Gearpump, Apache Hadoop, and Google Cloud Dataflow.

Until recently, Java was the predominant language for writing Beam jobs. Thanks to the Beam portability project, however, you can now choose among several languages for your pipelines (Java, Scala, Python, Go, SQL). The benefit is simple: you can use not only your favorite programming language to write data processing pipelines, but also all of its libraries.

After a brief introduction to Apache Beam, we explain how cross-language portability was made possible. We then showcase that portability with TFX, a Python library for machine learning with TensorFlow.

This talk is for everyone who wants to learn about Apache Beam, its API, and its portability layer. No machine learning knowledge required.
