Data Profiling and Fuzzy Matching in Power Query | Power Week 10.18 Power BI Desktop October update

In the October 2018 update of Power BI Desktop, the Power BI team introduced data profiling in Power Query.

In this video I show the new functionality. :)

QUESTIONS? COMMENTS? SUGGESTIONS? You’ll find me here:
► Twitter: @curbalen, @ruthpozuelo
Comments

Hi Ruth. I only just realised that profiling is only based on the top 1000 rows (as per the msg at the bottom of the screen). This was confusing when looking at my 18K row table! I hope it can look at the whole thing soon.

davidcadman

Thank you for sharing Ruth,
When looking at the Data Profiling feature, I cannot help but think of a similar feature in Azure ML that lets you inspect each column and get information about the quality of your data.

mehdihammadi

Crazy watching this video now and seeing the introduction of features that I only just met, since I started using Power BI this year. You were so excited 😂💯 Thankful I'm not alone in being confused by distinct and unique.

Yinusa

Fuzzy matching? Wow, just like Fuzzy Lookup in Excel. I have to test both solutions to find out which one performs better. I guess the fuzzy matching in PQ is going to win because Fuzzy Lookup in Excel (an add-in) has not been updated for a long time. Now I have to find time to do the comparison :)

pmsocho

Thanks Ruth, this example of the new Power BI features is very useful!

NorbertoVeraReatigaNVR

This video is simply awesome!! The difference between distinct and unique in data profiling is well explained. Thanks

AnalyzeIt_Nael

Great additions, explained in a great way.

pratikfutane

Hey Ruth, awesome show. Can you name the tool you use to highlight with different line colors with your pen? Thanks!

dbookmarker

Wow these functions look pretty useful!

christopherhastings

Thank you Ruth for the video!
I am actually struggling with fuzzy matching in R and Python, having performance issues on an 8,000-customer table. Impatient to try PBI fuzzy matching and its options. Are you sure the threshold is from 0 to 1? Maybe it is from 0 to 100?

didierterrien
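
On the threshold question above: Power Query's fuzzy merge takes its similarity threshold as a value between 0 and 1, with higher values demanding closer matches. The sketch below is only meant to show what a 0-to-1 threshold does in practice. It uses Python's standard `difflib.SequenceMatcher`, which is not the algorithm Power Query uses, and the helper names `similarity` and `fuzzy_match` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(name, candidates, threshold=0.8):
    """Return the best-scoring candidate, or None if it scores below threshold.

    threshold is on the same 0-to-1 scale as similarity(); 0.8 is an
    illustrative default chosen for this sketch.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(name, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


if __name__ == "__main__":
    customers = ["Microsoft", "Apple"]
    print(fuzzy_match("Microsft", customers))  # close spelling, passes 0.8
    print(fuzzy_match("Zebra", customers))     # nothing close enough
```

Lowering `threshold` accepts looser matches; at 1.0 only strings that are identical once case is ignored pass.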

Unique would have to do with performance evaluation too, I think. I'm not sure where I read this (maybe a blog by Wyn Hopkins, but not sure), but there was an example with a million rows or so: in scenario A they were populated with unique values and in scenario B with identical ones. The file size after saving the changes was huge, a few hundred MB if I remember correctly.

vitaliburla

Distinct = Filtered list; Unique = Not duplicate :)

sandip_bettereveryday
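
To make the distinction above concrete: in column profiling, the distinct count is the number of different values in a column, while the unique count is the number of values that appear exactly once. A minimal Python sketch, assuming the column is a plain list (the function name `profile_counts` and the sample data are hypothetical):

```python
from collections import Counter


def profile_counts(values):
    """Return (distinct, unique) counts for a column of values.

    distinct: how many different values occur at all.
    unique:   how many values occur exactly once.
    """
    occurrences = Counter(values)
    distinct = len(occurrences)
    unique = sum(1 for n in occurrences.values() if n == 1)
    return distinct, unique


if __name__ == "__main__":
    column = ["A", "A", "B", "C", "C", "D"]
    print(profile_counts(column))  # (4, 2): distinct A, B, C, D; unique B, D
```

A column is a candidate key when both counts equal the row count.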

My Options screen is different and has nowhere to enable fuzzy matching in the Options menu. Any clue or tip, please?

InglesallYou

Great. You should change the title to Data Profiling & Fuzzy Matching. "Unique" is not a good name - "single row", "single use" or "single instance" values (or something else) would surely be better. Unrelated, but since updating I find that deleting measures (in the field list only) results in "something went wrong" - is it the same for you? Also, I see no difference in the DAX editor and, in fact, I have lost all IntelliSense there - do you see that too?

davidcadman

I do not expect to use "Fuzzy" very much, but on the other hand, I will use "Data profiling" all the time.

rassten

Uniqueness is common in Data Quality tools. Very handy if you've got a huge data set and want to know if a value is a Key or if you've got some bad data in a field that should be unique.

TheVamos