Architecting for Data Quality in the Lakehouse with Delta Lake and PySpark
Join us for a live tech talk and learn about architecting for data quality in the lakehouse with Delta Lake and PySpark. After the presentation, we’ll have time for questions. Excited to have you join us!
From null values and duplicate rows to modeling errors and schema changes, data can break for millions of reasons. To combat this, teams are increasingly adopting best practices from DevOps and software engineering to identify, resolve, and even prevent this "data downtime" from happening in the first place. Join Prateek Chawla and Ryan Kearns as they walk through how data and ML engineers can solve for data quality across the data lakehouse by applying data observability techniques. Topics to be discussed include: how to optimize for data reliability across your lakehouse's metadata, storage, and query engine tiers, building your own data observability monitors with PySpark, and the role of tools like Delta Lake to scale this design.
Links: