What is a Lakehouse? Data Streaming and Batch Analytics.

Building Cloud-native Data Warehouses and Data Lakes with Data Streaming - Part 2: Data Lakehouse for Streaming and Analytics.

The concepts and architectures of a data warehouse, a data lake, and data streaming complement each other in solving business problems. Storing data at rest for reporting and analytics requires different capabilities and SLAs than continuously processing data in motion for real-time workloads. Many open-source frameworks, commercial products, and SaaS cloud services exist. Unfortunately, the underlying technologies are often misunderstood, overused in monolithic and inflexible architectures, and pitched by vendors for the wrong use cases. Let’s explore this dilemma in a short video series. Learn how to build a modern data stack with cloud-native technologies.

Part 1: Data Analytics at Rest vs. Data Streaming in Motion
Part 2: Data Lakehouse for Streaming and Analytics
Part 3: A Hybrid Cloud-native Lakehouse Project for Predictive Maintenance

The following topics are covered:
- Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
- Data Streaming for Data Ingestion into the Data Warehouse and Data Lake
- Data Warehouse Modernization: From Legacy On-Premises to Cloud-Native Infrastructure
- Case Studies: Cloud-native Data Streaming for Data Warehouse Modernization
- Lessons Learned from Building a Cloud-Native Data Warehouse
- Vendor mapping: How do various data analytics software vendors, technologies, and cloud providers fit? (e.g., Databricks, Snowflake, Google BigQuery, Amazon Redshift, Azure Synapse, Confluent, Apache Spark, Apache Kafka, TensorFlow)

A blog series also covers these topics.