Spark Streaming Advanced Training by Lead Developer Tathagata Das (Databricks)

// Overview //

• What is Spark Streaming?
• Why Spark Streaming?
• Integration with Batch Processing
• Fault-tolerant Streaming Processing
• Existing Streaming Systems
• Spark Streaming
• Programming Model - DStream
• Example - Get hashtags from Twitter
• Spark Languages
• Window-based Transformations
• Arbitrary Stateful Computations
• Arbitrary Combinations of Batch and Streaming Computations
• DStreams + RDDs = Power
• Databricks Keynote Demo
• Advantages of a Unified Stack
• Fault Tolerance
• Input Sources
• Future Directions
• Conclusion

// About the Presenter //

Tathagata Das is an Apache Spark Committer and a member of the PMC. He’s the lead developer behind Spark Streaming, and is currently employed at Databricks. Before Databricks, you could find him at the AMPLab of UC Berkeley, researching datacenter frameworks and networks with professors Scott Shenker and Ion Stoica.

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.

// Comments //

Hello sir,
I know this content is five years old, but I have one small question: can you name a few use cases where RDDs are the only option?

To be specific: are there any use cases where RDDs are more helpful than DF and DS?

Thanks in advance.

Tony_