How to build stream data pipeline with Apache Kafka and Spark Structured Streaming - PyCon SG 2019

Speaker: Takanori Aoki, Data Scientist, HOOQ

Objective: The main purpose of this session is to familiarize the audience with developing stream data processing applications using Apache Kafka and Spark Structured Streaming, and to encourage them to start experimenting with these technologies.

Description: In the Big Data era, massive amounts of data are generated at high speed by many types of devices. Stream processing technology plays an important role in making such data available to real-time applications. In this talk, Takanori will show how to implement a stream data pipeline, and applications on top of it, using Apache Kafka and Spark Structured Streaming with Python. He will focus on how to develop the applications rather than on system architecture, so that the audience becomes familiar with implementing stream processing in Python. Takanori will walk through example applications that use Tweet data and pseudo-data from mobile devices, and will also explain how to integrate streaming data with other data stores such as Apache Cassandra and Elasticsearch.

Note: The Python code for these applications will be uploaded to GitHub.
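The kind of pipeline the talk describes can be sketched in PySpark as below. This is not the speaker's code (that is on GitHub); it is a minimal illustration assuming a hypothetical Kafka topic `tweets` on `localhost:9092` carrying JSON records with `user` and `text` fields. Running the returned query also requires the spark-sql-kafka connector on the Spark classpath, so pyspark is imported lazily and the query is built but not started.

```python
def build_tweet_stream_query(kafka_servers="localhost:9092", topic="tweets"):
    """Build (but do not start) a Structured Streaming query that reads
    JSON tweet events from a Kafka topic and counts tweets per user.

    The broker address, topic name, and record schema are illustrative
    assumptions, not taken from the talk's actual code.
    """
    # Lazy imports: pyspark (and the Kafka connector) are only needed
    # when the query is actually built and started.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.appName("tweet-stream").getOrCreate()

    # Schema of the JSON payload carried in each Kafka record's value.
    schema = StructType([
        StructField("user", StringType()),
        StructField("text", StringType()),
    ])

    # Kafka source: each row has binary key/value plus topic metadata.
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", kafka_servers)
           .option("subscribe", topic)
           .load())

    # Kafka delivers bytes; cast the value to a string and parse the JSON.
    tweets = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("t"))
              .select("t.user", "t.text"))

    # A simple streaming aggregation: running tweet count per user.
    counts = tweets.groupBy("user").count()

    # Console sink for demonstration; to persist results, swap in a
    # Cassandra or Elasticsearch sink as the talk discusses.
    return (counts.writeStream
            .outputMode("complete")
            .format("console"))
```

Starting the returned writer with `.start()` (with a Kafka broker running and the connector package loaded) would begin the streaming job; calling `.awaitTermination()` on the resulting query keeps it running.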

About the speaker:

Produced by Engineers.SG
Comments:

Thx for the presentation. Can I find the source code somewhere?

rezahamzeh

Amazing presentation, how can I run the application?

youssefsassi

Thank you for uploading this and thanks to Takanori for amazing content

onewithsixonewithsix