17. Slowly Changing Dimension (SCD) Type 2 Using Mapping Data Flow in Azure Data Factory

In this video, I discuss the SCD Type 2 implementation using Mapping Data Flow in Azure Data Factory.

Link for Azure Databricks playlist:

Link for Azure Functions playlist:

Link for Azure Basics playlist:

Link for Azure Data Factory playlist:

Link for Azure Data Factory Real-time Scenarios playlist:

Link for Azure Logic Apps playlist:

#Azure #ADF #AzureDataFactory
Comments

Great work Maheer, a couple of observations:
1. A Type 2 dimension needs EffectiveStartDate / EffectiveEndDate columns too. If we add these columns, updating all history rows will keep resetting those dates, which defeats the Type 2 idea. It is also bad for performance, since we would always be updating every history row, be it millions.
2. During the first execution, we should be able to verify that, although the source has an entry for EmpId=1001, it has actually changed, because only then does it make sense to INSERT a new row and UPDATE the history row; otherwise we are simply duplicating rows with no changes.

- amitpundir
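The fix suggested in observation 1 can be sketched in Data Flow expression language; the column names IsCurrent, EffectiveStartDate, and EffectiveEndDate are hypothetical, not from the video:

    Alter Row policy on the update branch (touch only the row being expired,
    never the older history rows):
        Update if: IsCurrent == 1

    Derived Column on the update branch (close out the old version):
        IsCurrent        = 0
        EffectiveEndDate = currentTimestamp()

    Derived Column on the insert branch (open the new version):
        IsCurrent          = 1
        EffectiveStartDate = currentTimestamp()
        EffectiveEndDate   = toTimestamp('9999-12-31', 'yyyy-MM-dd')

With the IsCurrent == 1 guard, each run updates at most one row per business key, so history rows keep their original dates.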

Best way to implement SCD Type 2 😀👍 Very well explained.

- shaksrini

Great explanation... explained in a very easy way to understand the concept.

- dvsrikanth

Nice technique, great job! One small nitpick ... I'd prefer if you used true() instead of 1==1 for your Alter Row Update policy :)

- kromerm
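For reference, true() is a built-in Data Flow expression function, so the policy reads as an unconditional update without the 1==1 workaround:

    Alter Row settings on the update branch:
        Update if: true()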

Nice and superb explanation. Thanks a lot, Maheer.

- MohammedKhan-npdn

Good one, Maheer. Could you also cover, as a new session: handling duplicate records from the source, making some columns SCD Type 1 and others SCD Type 2 in the same table, and incremental load.

- Ravi-guww

Good explanation. But I guess you forgot to add a check for whether any of the columns coming from the source file has actually changed, because you should update the row in the target only if there is a difference between source and destination.

- bhosdiwalechacha
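One common way to add that check is a row hash compared across the two streams; a minimal sketch in Data Flow expression language, where the business columns EmpName, Department, and Salary are hypothetical examples:

    Derived Column on both the source stream and the dimension stream:
        RowHash = sha2(256, EmpName, Department, Salary)

    Filter after the lookup (let through only rows that actually changed):
        SourceRowHash != DimRowHash

Unchanged records then drop out of the flow, so they neither insert a duplicate version nor expire the existing one.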

This is a really good and helpful video. Just one suggestion: can you add record_create_date and record_expire_date columns and then upload? That would be great.

- himanshutrivedi

Nice job. Please keep them coming. How about a video on SCD Type 4 implementation?

- kenpoken

Great work Maheer.
How do we load a Parquet file from on-premises to an Azure SQL database using Azure Data Factory?

- raghavendrareddy

I implemented it as per your explanation, but I am facing an issue: the key column does not exist in the sink. Here is the screenshot.

- ramubuddi

Hi Maheer, can we use an inner join instead of lookup and filter?

- g.suresh

Create a branch from the source: use Alter Row to update the records in the sink that are present in the source, and in the other branch just use insert.

- mayank
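A rough sketch of that two-branch shape (transformation names are illustrative, not from the video):

    Source (CSV)
      branch 1: Lookup (dimension) -> Filter (matched rows) -> Alter Row (Update if: IsCurrent == 1) -> Sink
      branch 2: Derived Column (stamp new version) -> Alter Row (Insert if: true()) -> Sink

The New Branch transformation duplicates the stream, so the update path expires old versions while the insert path loads the new ones.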

In the update branch, we could replace the lookup and filter with an inner join.

- dipanjanmukherjee

Good explanation. What if I have duplicate rows in the source file? How do I filter them?

- pankajmandania
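One way to filter duplicates is a Window transformation followed by a Filter; in this sketch, the key EmpId and the ModifiedDate column used to pick the latest record are hypothetical:

    Window transformation:
        Over (partition by): EmpId
        Sort:                ModifiedDate descending
        Window column:       RowNum = rowNumber()

    Filter (keep one row per key):
        RowNum == 1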

Can you make a video that includes a start date and an end date, with the dates getting updated dynamically for Type 2 SCD? I see that as a necessity, and many people face this issue.

- rajpes

Could you please tell me how your pipeline behaves if you do not change anything? In my case, it inserts a new row with isrecent=1 and sets the previous row's isrecent=0, but since I am not changing anything, it should not be inserted again.

- martand

SCD Type 2 was explained properly, but one scenario was not covered: suppose we receive the same record from the source that is already present in the target. Under this logic it will still create a new record and mark the old record as inactive.

- theroy

In SSIS this is very, very easy to accomplish; why is it still so cumbersome in ADF?

- RobertLenior

Hi Maheer sir, in the case of SCD Type 2, we can use the inner join transformation. Since it takes only the common rows from the CSV file and the SQL table, based on the primary key column, there is no need to apply a filter transformation. I mean to say that instead of the lookup transformation followed by a filter transformation, we can directly use the inner join transformation, to simplify.
Am I right? Can we do so?

- mohanwarudkar
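That should work: an inner join keeps only the keys present on both sides, which is what the Lookup plus not-null Filter achieves. A minimal sketch, with the stream names src and dim as illustrative placeholders:

    Join settings (replacing Lookup + Filter on the update branch):
        Left stream:    src   (CSV source)
        Right stream:   dim   (existing dimension rows)
        Join type:      Inner
        Join condition: src@EmpId == dim@EmpId

One caveat: the Lookup transformation behaves like a left outer join and lets non-matching rows through, so if the same stream also feeds the insert path, the join type has to be chosen accordingly.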