Distributed Data Show Episode 86: Awkward-Free Spark Development with Holden Karau
Patrick talks with Holden Karau, Developer Advocate at Google, about ways to incorporate Spark in your application without impacting performance or having to learn Scala.
Highlights:
0:00 - The awkwardness of using an analytic framework like Spark in your application
0:40 - What to do when you have analytic requirements in your application: stream data to Spark, run jobs, and write the results back to your data store, where they can be consumed in real time. What not to do: query Spark on the critical path
3:53 - A great example: using Spark to perform rollups of IoT data in the background and writing them back into Cassandra. This lets you access both summary and current data in real time.
5:12 - How a whiteboard would come in handy on future episodes, and how to hack your boss :)
6:27 - Spark Streaming is one of a number of streaming solutions available, and it continues to evolve in Spark 3
8:25 - 5G wireless is low latency and high throughput, which is only going to increase the quantity of streaming data
9:30 - Running streaming Fortran in the cloud - is this a good idea?
10:30 - Wrapping up
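The rollup pattern from the 0:40 and 3:53 highlights can be sketched roughly as follows. This is a plain-Python stand-in, not Spark or Cassandra code: the sample readings, the `hourly_rollup` and `read_summary` functions, and the in-memory `summary_table` are all hypothetical names chosen for illustration. The point is only the shape of the design, where a background job computes summaries and the application's read path is a simple lookup that never runs analytics.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw IoT readings: (device_id, ISO-8601 timestamp, temperature).
RAW_READINGS = [
    ("sensor-1", "2019-03-01T10:05:00+00:00", 20.0),
    ("sensor-1", "2019-03-01T10:45:00+00:00", 22.0),
    ("sensor-1", "2019-03-01T11:10:00+00:00", 25.0),
    ("sensor-2", "2019-03-01T10:30:00+00:00", 18.0),
]


def hourly_rollup(readings):
    """Background job: aggregate raw readings into per-device hourly summaries.

    In the setup described in the episode this work would run in the analytics
    framework; here it is an ordinary function so the sketch stays runnable.
    """
    buckets = defaultdict(list)
    for device_id, ts, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[(device_id, hour)].append(value)
    return {
        key: {
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
        }
        for key, vals in buckets.items()
    }


# Stand-in for the summary table the job writes back to the data store.
summary_table = hourly_rollup(RAW_READINGS)


def read_summary(device_id, hour):
    """Application read path: a key lookup only, no analytics on the critical path."""
    return summary_table.get((device_id, hour))


if __name__ == "__main__":
    ten_am = datetime(2019, 3, 1, 10, tzinfo=timezone.utc)
    print(read_summary("sensor-1", ten_am))
```

Keeping the aggregation and the lookup in separate functions mirrors the advice in the episode: the expensive step can be slow or rescheduled without affecting how quickly the application answers a request.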
ABOUT DATASTAX ENTERPRISE 6
DataStax powers the Right-Now Enterprise with the always-on, distributed cloud database built on Apache Cassandra™ and designed for hybrid cloud. DataStax Enterprise 6 (DSE 6) includes industry-leading performance, self-driving operational simplicity, and robust analytics.
CONNECT WITH DATASTAX
ABOUT DATASTAX ACADEMY