#60. Azure Data Factory - Incremental Data Load using Lookup/Conditional Split

Incremental data load is a very important and widely used concept. The solution should capture updates to existing records as well as the arrival of new records, and handle each scenario accordingly. In ADF, we can perform the incremental load very easily by using the Lookup and Conditional Split transformations. Please watch through to learn more.
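A minimal Python sketch of the underlying logic (illustrative only, not ADF code; the key and column names are made up): each source row is hashed, looked up against the target by its business key, and then routed to an insert or update branch.

```python
import hashlib

def row_hash(row, key_columns):
    """Hash all non-key columns so any change in them changes the hash."""
    values = [str(row[c]) for c in sorted(row) if c not in key_columns]
    return hashlib.sha256("|".join(values).encode("utf-8")).hexdigest()

def split_incremental(source_rows, target_rows, key_columns=("id",)):
    """Mimic Lookup + Conditional Split: route each source row to insert or update."""
    # "Lookup": index existing target rows by business key -> stored hash
    target_index = {
        tuple(r[k] for k in key_columns): row_hash(r, key_columns) for r in target_rows
    }

    inserts, updates = [], []
    for row in source_rows:
        key = tuple(row[k] for k in key_columns)
        if key not in target_index:
            inserts.append(row)                                  # new record
        elif target_index[key] != row_hash(row, key_columns):
            updates.append(row)                                  # changed record
        # unchanged records are ignored
    return inserts, updates

# Example with made-up data: id 1 changed, id 2 is new
source = [{"id": 1, "name": "Ann", "city": "Oslo"}, {"id": 2, "name": "Bo", "city": "Rome"}]
target = [{"id": 1, "name": "Ann", "city": "Paris"}]
print(split_incremental(source, target))
```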
Comments

Is this solution applicable for a source DB with millions of records? The reason for asking is, how will this hash comparison work in the case of millions of records? Will it have performance issues?

manoharraju

That was clearly explained... however, it would have been even more useful if you had actually dragged in the components and set up the whole thing manually.

vivekkarumudi

Can you please share the query you wrote to create the hash column?

When I tried, I got the same value for all the rows in the hash column.

varunvengli
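On the "same hash for all rows" issue above: that usually happens when the hash is computed over a constant (or over column names) instead of the row's column values. A minimal Python sketch of the intended idea (illustrative only; not the exact T-SQL query from the video):

```python
import hashlib

def hash_columns(row, columns):
    # Concatenate the *values* of the chosen columns with a separator,
    # so two rows only share a hash if all of those values match.
    payload = "|".join(str(row[c]) for c in columns)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bo",  "city": "Rome"},
]
for r in rows:
    print(r["id"], hash_columns(r, ["name", "city"]))  # distinct values -> distinct hashes
```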

Lookup has a limit of 5k rows, right? How do we deal with an input that has 1 million rows?

robinson

Hi Madam, the video is good. I have a few doubts.
1) I want to know why you did not use a watermark table. This approach does not keep the full history like an SCD Type 2 load, and it may affect performance since we are comparing all records from the target.
2) Which activities did you use? They are difficult to identify in the video because their names were changed. Could you list the activities for me?

Thanks, madam.

sethuramalingami
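On the watermark question above: a watermark approach pulls only the rows changed since the last successful run instead of comparing every record against the target. A minimal sketch of that idea (hypothetical column and variable names, not the video's solution):

```python
from datetime import datetime

# Hypothetical in-memory stand-ins for a watermark table and a source table
watermark = {"table": "dbo.Customers", "last_value": datetime(2023, 1, 1)}
source_rows = [
    {"id": 1, "name": "Ann", "modified_date": datetime(2022, 12, 30)},
    {"id": 2, "name": "Bo",  "modified_date": datetime(2023, 2, 15)},
]

def incremental_extract(rows, last_value):
    """Only rows modified after the stored watermark are extracted."""
    return [r for r in rows if r["modified_date"] > last_value]

changed = incremental_extract(source_rows, watermark["last_value"])
print(changed)  # only id 2

# After a successful load, advance the watermark to the newest change seen
if changed:
    watermark["last_value"] = max(r["modified_date"] for r in changed)
```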

Hi Mam

Please respond to an urgent query.

I have a time value in a CSV file; how do I convert it into a Time type in Data Factory? I don't have a date component, so I need to convert the CSV time field into a time format.

hackifysecretsau

Hi Madam, how can we convert different date formats into one date format?
For example, 'yy/mm/dd' (or) 'dd/mm/yyyy' into the 'yyyy-MM-dd' date format.

Can we implement this in an Azure data flow?

prashanthn
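On the date-format question above: one way to normalize mixed inputs such as 'yy/mm/dd' and 'dd/mm/yyyy' into 'yyyy-MM-dd' is to try each expected input format in turn. ADF mapping data flows have their own date expressions; this Python sketch only illustrates the logic and assumes just those two input formats occur:

```python
from datetime import datetime

# Assumed input formats; ambiguous values are resolved by trying them in this order
INPUT_FORMATS = ["%y/%m/%d", "%d/%m/%Y"]

def normalize_date(value):
    """Return the date as yyyy-MM-dd, whichever known input format it arrives in."""
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")

print(normalize_date("23/05/01"))    # yy/mm/dd   -> 2023-05-01
print(normalize_date("01/05/2023"))  # dd/mm/yyyy -> 2023-05-01
```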

Hello Mam, I need some suggestions. I need to build an incremental data extraction pipeline in ADF. ServiceNow is my source, and I am extracting data in JSON format and storing it in blob storage. I need to extract only the latest updated or inserted data from ServiceNow.

kapilganweer

Hi, I need to copy data from 5 tables in Azure Data Lake to 1 table in Cosmos DB. We need a particular field based on the relationships. Thanks in advance.

mohanvp

Please let me know why the Lookup is needed; we have the Conditional Split anyhow, right?

ADFTrainer

Let me know if my understanding is incorrect, but isn't this similar to the upsert operation, and can't this be achieved using the Alter Row --> upsert option as before? Also, this looks structurally the same as the SCD component's output in SSIS!

SaurabRao
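On the upsert comparison above: the end result is indeed similar; the Lookup/Conditional Split variant just makes the insert and update paths explicit (for example, to count or log them separately). For contrast, a plain upsert collapses both paths into one step, as in this small Python sketch (illustrative, not ADF's Alter Row):

```python
def upsert(target_rows, source_rows, key="id"):
    """Single-pass upsert: insert new keys, overwrite existing ones, no explicit split."""
    merged = {r[key]: r for r in target_rows}
    merged.update({r[key]: r for r in source_rows})  # source wins for matching keys
    return list(merged.values())

target = [{"id": 1, "name": "Ann"}]
source = [{"id": 1, "name": "Anna"}, {"id": 2, "name": "Bo"}]
print(upsert(target, source))  # id 1 updated, id 2 inserted
```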

Hello Ma'am,

I have a problem with an incremental load. I want to create an incremental pipeline from an on-premises Oracle server to Azure Data Lake (blob storage); I don't have Azure SQL. I just want to push the data into blob storage as a CSV file. In my case, I am confused about where I should create the watermark table. Someone told me that in my case I have to use Parquet data. Please help me with this; I have been stuck for many days.

souranwaris

Hi Mam, we don't have a date column on the source side. Can we still implement the same process?

palivelaanjaneyagupta

How can we identify if a record is deleted in the source, and how do we capture that in the target?

jeffrypaulson
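On detecting deletes: a common approach is a set difference on the business key, i.e. keys that still exist in the target but are no longer in the source get flagged (soft delete) or removed. A minimal sketch of that idea (illustrative only, not from the video):

```python
def find_deleted_keys(source_rows, target_rows, key="id"):
    """Keys present in the target but missing from the source were deleted upstream."""
    source_keys = {r[key] for r in source_rows}
    return [r[key] for r in target_rows if r[key] not in source_keys]

source = [{"id": 1}, {"id": 3}]
target = [{"id": 1}, {"id": 2}, {"id": 3}]
print(find_deleted_keys(source, target))  # [2] -> mark as deleted / remove in the target
```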

Ma'am, what is the difference between the Switch activity and the If Condition activity in ADF? Please reply.

sancharighosh

You explain like a school teacher. I really feel as if my class teacher is teaching me the concepts. Very thankful for your efforts, Mam!!

shreeyashransubhe

Please make this incremental load dynamic, it will help us a lot...

naveenkumar-ijmv

Isn't it the same as Alter Row (upsert)? We can achieve the same thing, right?

pawanreddie

Mam, I have a doubt about the fault tolerance part in ADF. I have configured an ADLS Gen2 storage account for writing the log, where I'm getting this error:

"Azure Blob connect cant support this feature for hierarchical namespace enabled storage accounts, please use azure data lake gen2 linked service for this storage account instead".

The thing is, I'm already using Azure Data Lake Store Gen2, but I am still receiving the error. Can you help in fixing this?

raghavendarsaikumar

Thanks for sharing your knowledge. Could you do a video on how to delete target SQL table rows that do not exist in the source file? I tried it through "doesn't exist", but it is giving weird results: 5 records are missing from the source but exist in the target SQL table, yet "doesn't exist" shows 30 records, and I am not sure why. Thanks in advance.

sumanyarlagadda