Streaming Machine Learning with Apache Kafka and TensorFlow

Streaming Machine Learning with Apache Kafka and TensorFlow, without the need for a data store / data lake like AWS S3 or HDFS.

GitHub project (real-time analytics with 100,000 connected cars, Kafka, and TensorFlow I/O):

More content about Apache Kafka and Machine Learning:

Machine learning is separated into model training and model inference. ML frameworks typically load historical data from a data store like HDFS or S3 to train models. This talk shows how you can completely avoid such a data store by ingesting streaming data directly via Apache Kafka from any source system into TensorFlow for model training and model inference, using the capabilities of the "TensorFlow I/O" add-on.
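
As an illustration of that direct ingestion, here is a minimal sketch using the Kafka dataset from the tensorflow-io package. The topic name, broker address, and the assumption that each message value is a CSV line of ten float features plus a binary label are placeholders, not details from the talk:

```python
import tensorflow as tf
import tensorflow_io as tfio

def decode(raw_message, raw_key):
    # Assumed wire format: each message value is a CSV line of
    # ten float features followed by a binary label.
    fields = tf.io.decode_csv(raw_message, record_defaults=[[0.0]] * 11)
    return tf.stack(fields[:10]), fields[10]

# Consume the topic directly as a tf.data pipeline via the Kafka plugin;
# no intermediate data store is involved.
train_ds = tfio.experimental.streaming.KafkaGroupIODataset(
    topics=["car-sensors"],          # hypothetical topic
    group_id="tf-trainer",
    servers="localhost:9092",
    stream_timeout=10000,            # stop after 10 s without new events
    configuration=["auto.offset.reset=earliest"],
).map(decode).batch(32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(train_ds, epochs=1)
```

The same dataset object also plugs into model.evaluate or model.predict, so training and inference share a single ingestion path.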

The talk compares this modern streaming architecture to traditional batch and big data alternatives and explains benefits like the simplified architecture, the ability to reprocess events in the same order for training different models, and the possibility to build a scalable, mission-critical, real-time ML architecture with far fewer headaches and problems.
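
The reprocessing benefit follows from Kafka's log semantics: a new consumer group configured to start at the earliest offset replays the topic from the beginning. A sketch of training two candidate models on the identical event sequence, under the same hypothetical topic and wire format as above:

```python
import tensorflow as tf
import tensorflow_io as tfio

def decode(raw_message, raw_key):
    # Same assumed wire format as above: ten float features + label.
    fields = tf.io.decode_csv(raw_message, record_defaults=[[0.0]] * 11)
    return tf.stack(fields[:10]), fields[10]

def replay(group_id):
    # A fresh consumer group with auto.offset.reset=earliest starts at the
    # beginning of the log, so every run sees the same events again.
    return tfio.experimental.streaming.KafkaGroupIODataset(
        topics=["car-sensors"],
        group_id=group_id,
        servers="localhost:9092",
        stream_timeout=10000,
        configuration=["auto.offset.reset=earliest"],
    ).map(decode).batch(32)

def make_model(width):
    m = tf.keras.Sequential([
        tf.keras.layers.Dense(width, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    m.compile(optimizer="adam", loss="binary_crossentropy")
    return m

# Two different model variants trained on the exact same event stream.
make_model(32).fit(replay("trainer-a"), epochs=1)
make_model(128).fit(replay("trainer-b"), epochs=1)
```

Note that Kafka guarantees ordering per partition, so a strict total order over all events assumes a single partition or key-based partitioning.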

Key takeaways for the audience:
- Scalable open source Machine Learning infrastructure
- Streaming ingestion into TensorFlow without the need for another data store like HDFS or S3 (leveraging TensorFlow I/O and its Kafka plugin)
- Stream processing using analytic models in mission-critical deployments to act in real time (see the sketch after this list)
- Learn how the Apache Kafka open source ecosystem, including Kafka Connect, Kafka Streams, and KSQL, helps to build, deploy, score, and monitor analytic models
- Comparison and trade-offs between this modern streaming approach and traditional batch model training infrastructures
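
For the stream-processing takeaway, the talk itself points to Kafka Streams and KSQL on the JVM; purely to illustrate the pattern, here is a Python sketch that consumes events, scores them with a previously trained model, and publishes predictions to an output topic. The kafka-python client, the topic names, the saved-model path, and the CSV format are all assumptions:

```python
import numpy as np
import tensorflow as tf
from kafka import KafkaConsumer, KafkaProducer

# Hypothetical path to a model trained as in the sketches above.
model = tf.keras.models.load_model("car_sensor_model")

consumer = KafkaConsumer("car-sensors",
                         bootstrap_servers="localhost:9092",
                         group_id="scorer")
producer = KafkaProducer(bootstrap_servers="localhost:9092")

for record in consumer:
    # Assumed wire format: CSV line of ten float features.
    features = np.array(
        [float(x) for x in record.value.decode("utf-8").split(",")][:10]
    ).reshape(1, -1)
    score = float(model.predict(features, verbose=0)[0, 0])
    # Publish the score so downstream consumers can act on it in real time.
    producer.send("car-scores", key=record.key,
                  value=str(score).encode("utf-8"))
```
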
Comments

Very interesting, useful, and well-organized presentation, thanks a lot!

ericbarbier

Excellent presentation and organized very nicely. Thanks for putting this out.

subramaniank

@23:24, if there is no storage in the middle and the streaming events can be used for training, does that mean each event is going to incrementally (stochastically) train the model and then be lost? As in, there will be no static collection of data that has been used to train a certain model? I am a bit confused. But overall thank you so much for the great insight!!

davidoh