Time Series Analytics with Spark: Spark Summit East talk by Simon Ouellette

spark-timeseries is a Scala / Java / Python library for interacting with time series data on Apache Spark.

Time series are an important part of data science applications, but they are notoriously difficult to handle in distributed systems because of their sequential nature. Getting this right is therefore a challenging but important step in applying distributed systems to data science.

This talk covers the current overall design of spark-timeseries and its current functionality, and provides some usage examples. Because the project is still at an early stage, the talk also covers its current weaknesses and the future improvements on the project roadmap.