Distributed Data Show Episode 86: Awkward-Free Spark Development with Holden Karau

Patrick talks with Holden Karau, Developer Advocate at Google, about ways to incorporate Spark in your application without impacting performance or having to learn Scala.

Highlights:
0:00 - The awkwardness of using an analytic framework like Spark in your application
0:40 - What to do when you have analytic requirements in your application: stream data to Spark, run jobs, and write the results back to your data store, where they can be consumed in real time. What not to do: query Spark on the critical path
3:53 - A great example: using Spark to perform rollups of IoT data in the background and writing the results back into Cassandra. This allows you to access both summary and current data in real time.
5:12 - How a whiteboard would come in handy on future episodes, and how to hack your boss :)
6:27 - Spark Streaming is one of a number of streaming solutions available, and it continues to evolve in Spark 3
8:25 - 5G wireless offers low latency and high throughput, which is only going to increase the quantity of streaming data
9:30 - Running streaming Fortran in the cloud - is this a good idea?
10:30 - Wrapping up
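
The pattern from the 0:40 and 3:53 highlights (aggregate in the background, write the results back, and keep the analytics engine off the request path) can be sketched roughly as follows. This is a minimal plain-Python illustration, not Spark or Cassandra code and not something shown in the episode: the in-memory `raw_readings` list and `hourly_rollups` dict are hypothetical stand-ins for data store tables, and every function name is an assumption made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical in-memory stand-ins for two tables in the application's data store.
raw_readings = []     # (sensor_id, timestamp, value) tuples, appended as data arrives
hourly_rollups = {}   # (sensor_id, hour) -> summary dict written by the background job


def record_reading(sensor_id, timestamp, value):
    """Ingest path: store the raw reading as it streams in."""
    raw_readings.append((sensor_id, timestamp, value))


def run_rollup_job():
    """Background job: aggregate readings per sensor per hour and write them back."""
    buckets = defaultdict(list)
    for sensor_id, ts, value in raw_readings:
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(sensor_id, hour)].append(value)
    for key, values in buckets.items():
        hourly_rollups[key] = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }


def get_hourly_summary(sensor_id, hour):
    """Request path: a plain key lookup, with no analytics job involved."""
    return hourly_rollups.get((sensor_id, hour))
```

In a real deployment the rollup step would presumably be a scheduled or streaming Spark job and the two structures would be Cassandra tables; the point of the sketch is only that the application reads precomputed summaries rather than querying the analytics engine on the critical path.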

ABOUT DATASTAX ENTERPRISE 6
DataStax powers the Right-Now Enterprise with the always-on, distributed cloud database built on Apache Cassandra™ and designed for hybrid cloud. DataStax Enterprise 6 (DSE 6) includes industry-leading performance, self-driving operational simplicity, and robust analytics.