Geospatial Options in Apache Spark

Geospatial data appears simple right up until it becomes intractable. There are many gotchas with geospatial data in Spark, and we will break them down in this talk. Users who are new to geospatial analysis in Spark will find this portion useful, since projections, geometry types, indices, and geometry storage can all cause issues. We will begin by briefly discussing the basics of geospatial data and why it can be so challenging, in the context of how it can cause scaling problems in Spark. Critically, we will show how we have approached these issues to limit errors and reduce cost. There are many geospatial packages available for Spark. We have tried many of them and will discuss the pros and cons of each, using common examples across libraries. New users will benefit from this discussion, as each library has advantages in specific scenarios. Lastly, we will discuss how we migrate geospatial data, including our best practices for ingesting it and how we store it for long-term use. Users may be specifically interested in our evaluation of spatial indexing for rapid retrieval of records.
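To make the spatial indexing point concrete, here is a minimal sketch of a fixed-grid index in plain Python. It is not taken from the talk and does not use any Spark library; the function names (`cell_id`, `build_index`, `query_bbox`), the 1-degree cell size, and the sample points are all assumptions chosen for illustration. The idea is only that bucketing records by grid cell lets a bounding-box query inspect a few cells instead of every record.

```python
import math
from collections import defaultdict

def cell_id(lon, lat, cell_deg=1.0):
    """Map a lon/lat point to the (column, row) of a fixed-size grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))

def build_index(points, cell_deg=1.0):
    """Bucket (name, lon, lat) records by the grid cell they fall in."""
    index = defaultdict(list)
    for name, lon, lat in points:
        index[cell_id(lon, lat, cell_deg)].append((name, lon, lat))
    return index

def query_bbox(index, min_lon, min_lat, max_lon, max_lat, cell_deg=1.0):
    """Return sorted names of points inside the box, scanning only overlapping cells."""
    c0, r0 = cell_id(min_lon, min_lat, cell_deg)
    c1, r1 = cell_id(max_lon, max_lat, cell_deg)
    hits = []
    for c in range(c0, c1 + 1):
        for r in range(r0, r1 + 1):
            # Cells only narrow the candidates; an exact check is still needed.
            for name, lon, lat in index.get((c, r), []):
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                    hits.append(name)
    return sorted(hits)

# Hypothetical sample data: (name, longitude, latitude) in degrees.
points = [
    ("denver", -104.99, 39.74),
    ("boulder", -105.27, 40.01),
    ("sydney", 151.21, -33.87),
]
index = build_index(points)
nearby = query_bbox(index, -106.0, 39.0, -104.0, 41.0)
```

Production libraries typically use hierarchical schemes rather than a flat grid like this one, but the trade-off is similar: smaller cells mean fewer false candidates per cell and more cells to visit per query.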

About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Comments

Is the map demo on YouTube? I would love to see it.

MatSchaffer-Com

Is the ST_Difference function available in Spark or in Sedona (aka GeoSpark)?
