Spark ETL with Lakehouse | Apache HUDI

In this video, we discuss Spark ETL with a Lakehouse built on Apache Hudi.

Medium Blog:

Github Repo:

Blog Links:

YouTube Playlist:

This series covers Spark ETL pipelines across many different types of sources. Detailed plan:

0. Chapter0 - Spark ETL with Files (JSON | Parquet | CSV | ORC | AVRO)
1. Chapter1 - Spark ETL with SQL Database (MySQL | PostgreSQL)
2. Chapter2 - Spark ETL with NoSQL Database (MongoDB)
3. Chapter3 - Spark ETL with Azure (Blob | ADLS)
4. Chapter4 - Spark ETL with AWS (S3 bucket)
5. Chapter5 - Spark ETL with Hive tables
6. Chapter6 - Spark ETL with APIs
7. Chapter7 - Spark ETL with Lakehouse (Delta)
8. Chapter8 - Spark ETL with Lakehouse (HUDI)
9. Chapter9 - Spark ETL with Lakehouse (Apache Iceberg)
10. Chapter10 - Spark ETL with Lakehouse (Delta vs Iceberg vs HUDI)
11. Chapter11 - Spark ETL with Lakehouse (Delta table Optimization)
12. Chapter12 - Spark ETL with Lakehouse (Apache Kafka)
13. Chapter13 - Spark ETL with GCP (BigQuery)
14. Chapter14 - Spark ETL with Hadoop (Apache Sqoop)