7 How to monitor Azure Data Factory (ADF) Logs using Monitor, Log Analytics Workspace & Dashboard

Data Factory logs can be exported to a Log Analytics workspace using the "Diagnostic settings" blade of Azure Monitor. They can then be analyzed with the "Azure Data Factory Analytics" solution using the Kusto Query Language, and the results can be pinned to Azure dashboards. Please go through the video for the complete steps.

Azure Data Factory logs → Monitor (Diagnostic settings) → Log Analytics Workspace → Azure Data Factory Analytics (Kusto Query Language) → Pin to Dashboard

Sample query in Kusto:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "ActivityRuns" and status_s !in ("Waiting", "Running", "InProgress", "Queued")
| order by TimeGenerated desc
| extend SL = row_number()
| project SL, TimeGenerated, ActivityName = split(OperationName, "-")[0], status_s
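Once a query like the one above works, an aggregated view is often more useful as a dashboard tile. A minimal sketch, assuming the standard AzureDiagnostics schema (the "PipelineRuns" category and the status_s column must be present in your workspace):

```kusto
// Hourly pipeline-run outcomes, suitable for pinning as a timechart tile.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
| where status_s in ("Succeeded", "Failed")
| summarize Runs = count() by status_s, bin(TimeGenerated, 1h)
| render timechart
```

Run it in the Log Analytics query editor and use "Pin to dashboard" on the resulting chart.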

Other operators and functions used from the Kusto language:
split
join
parse_json
substring
leftouter
toint
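A combined sketch of these operators, for illustration only (Input_source_s and the join column are placeholders here, not columns confirmed by the video; check the actual column names in your workspace):

```kusto
let Runs = AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.DATAFACTORY" and Category == "ActivityRuns";
Runs
| extend ActivityName = split(OperationName, "-")[0]             // split
| extend TableName = substring(UserProperties_Destination_s, 8)  // substring
| extend Payload = parse_json(Input_source_s)                    // parse_json
| extend Id = toint(Payload.Id)                                  // toint
| join kind=leftouter (                                          // join + leftouter
    Runs | project pipelineRunId_g, status_s)
    on pipelineRunId_g
```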

Sample query in Kusto for a metadata-driven pipeline:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
| where pipelineName_s == "DMRefresh_BottomLevel" // please update the pipeline name
| where activityType_s == "Copy"
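The query above only filters; to make the output readable you would typically add a projection. A sketch, assuming the Output_rowsRead_d and Output_rowsCopied_d columns are populated for Copy activities in your workspace:

```kusto
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
| where pipelineName_s == "DMRefresh_BottomLevel" // please update the pipeline name
| where activityType_s == "Copy"
| project TimeGenerated, activityName_s, status_s, Output_rowsRead_d, Output_rowsCopied_d
```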

Below is a sample Kusto query for a metadata-driven pipeline with multiple activities across top, middle, and bottom pipelines. It is complex because I have joined four tables to show the status of the "copy" step and the status of the "watermark update" step (the last step) in one dashboard. Both the copy step and the watermark step are in the bottom-level pipeline, so the last join alone would have been sufficient. However, I wanted to filter on one type of trigger, and the trigger name is present only at the top level, so I had to join top to middle, middle to bottom, and then bottom (copy step) to bottom (watermark update step). I have also parsed JSON to get the actual data, and used leftouter, toint, and other functions. When you join two tables that share column names, the right-hand columns are renamed automatically: for example, status_s belongs to the first table (before the join keyword), and status_s1 is the status from the second table. Since four tables are joined via three joins, there are four status columns: status_s, status_s1, status_s2 & status_s3.

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
| where substring(Input_parameters_TriggerName_s, 0, 12) == "RefreshEvery" // I have multiple metadata-driven pipelines, so filter on one type of trigger
| join (AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.DATAFACTORY"
    | where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
    | where pipelineName_s == "DMRefresh_MiddleLevel" // need to change the name
    ) on $left.Output_pipelineRunId_g == $right.pipelineRunId_g // join key assumed (missing from the original description); adjust to your correlation column
| join (AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.DATAFACTORY"
    | where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
    | where pipelineName_s == "DMRefresh_BottomLevel" // need to change the name
    ) on $left.Output_pipelineRunId_g1 == $right.pipelineRunId_g // join key assumed; adjust to your correlation column
| extend JSON1 = parse_json(Input_parameters_ObjectsPerGroupToCopy_s1)
| extend Table_Copy = substring(UserProperties_Destination_s2, 8)
| extend Id_JSON_string = case(Table_Copy == parse_json(tostring(JSON1[0].SinkObjectSettings)).table, JSON1[0].Id,
    Table_Copy == parse_json(tostring(JSON1[1].SinkObjectSettings)).table, JSON1[1].Id,
    Table_Copy == parse_json(tostring(JSON1[2].SinkObjectSettings)).table, JSON1[2].Id,
    Table_Copy == parse_json(tostring(JSON1[3].SinkObjectSettings)).table, JSON1[3].Id, "9999")
| extend Id_JSON = toint(Id_JSON_string)
| join kind=leftouter (AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.DATAFACTORY"
    | where Category == "ActivityRuns" and status_s != "InProgress" and status_s != "Queued"
    | where pipelineName_s == "DMRefresh_BottomLevel" // need to change the name
    | where activityName_s == "UpdateWatermarkColumnValue"
    | extend Id_Watermark_step = toint(Input_storedProcedureParameters_Id_value_d)
    ) on $left.Id_JSON == $right.Id_Watermark_step
| order by Output_rowsRead_d2 desc
| extend SL_Number = row_number()
| project SL_Number, TimeGenerated, TableName = substring(UserProperties_Destination_s2, 8), Copy_Step_Status = status_s2, Update_Watermark_Status = status_s3, Output_rowsRead_d2, Output_rowsCopied_d2, Input_parameters_TriggerName_s, pipelineRunId_g2