End-to-end data validation strategies in Microsoft Fabric (+ 3 DEMOS)

Data validation, and data quality in general, is probably the MOST IMPORTANT thing you need to get right when you're creating analytics solutions in Microsoft Fabric.

Without validated data, what use is that fancy machine learning model, or that fancy Power BI report?

In this video, we look in detail at the why, when and how of data validation in Microsoft Fabric.

I walk through three example notebooks to give you hands-on experience of how to implement data validation at three different stages in an end-to-end pipeline: incoming table, table data (Spark dataframes) and Power BI semantic models.
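
To make the first of those stages concrete, here is a minimal sketch of a schema check in plain Python. It is not the code from the demo notebooks, which use GX; the `EXPECTED_SCHEMA` columns and the `validate_schema` helper are hypothetical names invented for illustration, and the rows are assumed to arrive as plain dictionaries.

```python
# Hypothetical expected schema (assumption): column name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "customer": str, "amount": float}


def validate_schema(rows, expected):
    """Return a list of problems found; an empty list means the batch passed."""
    problems = []
    for i, row in enumerate(rows):
        # Columns the schema requires but the row lacks.
        for col in sorted(expected.keys() - row.keys()):
            problems.append(f"row {i}: missing column '{col}'")
        # Columns the row carries but the schema does not know about.
        for col in sorted(row.keys() - expected.keys()):
            problems.append(f"row {i}: unexpected column '{col}'")
        # Type check for every expected column that is present.
        for col, typ in expected.items():
            if col in row and not isinstance(row[col], typ):
                problems.append(
                    f"row {i}: column '{col}' expected {typ.__name__}, "
                    f"got {type(row[col]).__name__}"
                )
    return problems


if __name__ == "__main__":
    batch = [
        {"order_id": 1, "customer": "Contoso", "amount": 19.99},
        {"order_id": "2", "customer": "Fabrikam"},  # wrong type, missing column
    ]
    for problem in validate_schema(batch, EXPECTED_SCHEMA):
        print(problem)
```

Returning a list of problems rather than raising on the first failure keeps the whole batch's issues visible in one run, which is roughly the role a validation report plays in a dedicated tool.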

#powerbi #microsoftfabric #datavalidation #dataquality

Timeline
0:00 Intro
2:38 Why data validation is so important
3:56 What can go wrong in an analytics pipeline?
8:32 The problem with Power BI
9:22 The huge opportunity in Fabric
11:45 Three types of data validation in Fabric
12:47 Schema validation overview
14:52 Schema validation with GX demo
23:05 Table/Spark dataframe validation overview
24:38 DBT
25:33 Spark dataframe validation with GX demo
31:00 Semantic model validation overview
32:45 Semantic model validation with GX demo
38:50 Wide review of tooling and approaches
43:25 Enterprise-scale data-quality monitoring strategy
46:53 Failure monitoring with Data Activator
48:20 Certifying valid datasets
49:19 Steps to embed data validation in your organisation
51:09 Final words
Comments

Medallion architecture ✅
Data Validation ✅
Well explained ✅

You should be an MVP

alexrook

Amazing work again!! I was looking for DQ options for a project and this video summed it all up so well!! Thank you for sharing the notebooks.

vibhatt

Brilliant video. I liked how you used the "gx lite" approach at each medallion layer to build out the validation steps. This really makes the implementation a no-brainer and removes some of the complexity.

rameshpaskarathas

It was insightful and provided a lot of valuable learning opportunities. Thank you for sharing your knowledge and expertise. Keep up the fantastic work!

gunax

The topic has been covered in great detail. Thanks for creating such an elaborate video.

SumeshKashyap

One of the best videos on the subject. Great job.

marina

Thanks Will, it was informative and simple.

azizkatlane

Amazing content, really helpful on my Microsoft Fabric journey! Thank you!

GarethNel

Very helpful channel for learning Microsoft Fabric 🥇👍😊

anitatrpenoska

I made it this far, and many thanks again for this great content.

azwarmzafar

Will, thank you for sharing your knowledge!

sandrojorgeoliveira

✅ A whole new level of enterprise data quality.

robertdavies

Your channel is the only source I use to learn Fabric. Thank you.

farzadsaedi

Excellent content, I learned a lot today ✅

codescarsoftware

✅ Super helpful, Will, thanks for this.

timbojj

Excellent! William, I would be grateful if you could expand a bit on how to treat Great Expectations results. I am struggling to create a database out of the JSON results.

KwabenaOwusu-fgdn

Thanks for the detailed training, Will. Quick question: are validation notebooks also placed in a data pipeline? If yes, when organizing the data pipeline, is it recommended that the validation notebooks are executed before the actual cleansing notebooks?

johnuzoma

Great video! Even for a not so "deep techie" ;-)

andreasratz

Where do data quality solutions like Informatica fit in, please?

jaggyjut

Are you on Twitter as well?

Great video

Rothbardo