Harnessing Coral and Iceberg for Advanced Incremental View Maintenance (LinkedIn)

The rapid expansion of data lakes drives up compute overhead and data processing delays. Incremental view maintenance is a crucial strategy for improving data lake efficiency: by avoiding duplicate computation, it reduces both cost and latency. However, writing incremental logic can be complex, especially when execution engines lack built-in support for incremental maintenance. In this talk, Walaa and Aastha present how LinkedIn uses Coral's intermediate representation and relational algebra, along with Apache Iceberg, to implement an easy-to-adopt, automatic incremental view maintenance system. They dive into the details of this workflow and show how these techniques democratize incremental maintenance, making it an add-on feature for modern data lakes and execution engines. They also walk through an end-to-end UX demo based on dbt and Apache Spark to show the effectiveness and performance benefits of this approach.