I am often asked to automate the management of Azure Data Factory (ADF) pipelines. Below are the PowerShell commands I use most often for that work.
- Retrieve pipelines from a given factory
    $pipelines = Get-AzureRmDataFactoryPipeline -DataFactoryName MyFactory -ResourceGroupName MyResourceGroup
If you have multiple factories, you can append the results of another call to the same variable:
    $pipelines += Get-AzureRmDataFactoryPipeline -DataFactoryName MyOtherFactory -ResourceGroupName MyOtherResourceGroup
- Similarly, there are commands to retrieve datasets and linked services
    $datasets = Get-AzureRmDataFactoryDataset -DataFactoryName MyFactory -ResourceGroupName MyResourceGroup
    $linkedServices = Get-AzureRmDataFactoryLinkedService -DataFactoryName MyFactory -ResourceGroupName MyResourceGroup
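For more than a couple of factories, the repeated `+=` calls can be wrapped in a small helper. This is only a minimal sketch: the function name `Get-AllFactoryPipelines` and the shape of its `-Factories` parameter (hashtables with `Name` and `ResourceGroup` keys) are my own assumptions for illustration and are not part of the AzureRM module.

```powershell
# Hypothetical helper (not part of AzureRM): collects pipelines from several factories.
# Each entry in $Factories is assumed to be a hashtable with Name and ResourceGroup keys.
function Get-AllFactoryPipelines {
    param(
        [Parameter(Mandatory = $true)]
        [hashtable[]] $Factories
    )

    foreach ($factory in $Factories) {
        # Emitting each result to the output stream lets PowerShell build the collection,
        # so no explicit += is needed.
        Get-AzureRmDataFactoryPipeline `
            -DataFactoryName $factory.Name `
            -ResourceGroupName $factory.ResourceGroup
    }
}

# Example call, using the factory names from above:
# $pipelines = Get-AllFactoryPipelines -Factories @(
#     @{ Name = 'MyFactory';      ResourceGroup = 'MyResourceGroup' },
#     @{ Name = 'MyOtherFactory'; ResourceGroup = 'MyOtherResourceGroup' }
# )
```

The same pattern should work for datasets and linked services by swapping the cmdlet inside the loop.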
The hierarchy of ADF pipelines is as follows (there may be variations on this pattern that I am not aware of):
Factory → Pipelines → Activities → DataSets → LinkedServices
Looping over the hierarchy:
    $pipelines | ForEach-Object {
        $pipelineName = $_.PipelineName
        $activities = $_.Properties.Activities
        $activities | ForEach-Object {
            $activityName = $_.Name
            $inputs = $_.Inputs
            $outputs = $_.Outputs
            $inputs | ForEach-Object {
                # and so on
            }
        }
    }
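One way to finish the "and so on" part is to emit one row per pipeline, activity, and dataset, which gives a flat table that is easy to filter or export. The sketch below runs against stand-in objects so it can be tried without an Azure subscription. All the sample names are made up, and I am assuming that each input and output exposes a `Name` property.

```powershell
# Stand-in data shaped like the objects used above; every name here is made up.
$pipelines = @(
    [pscustomobject]@{
        PipelineName = 'CopyPipeline'
        Properties   = [pscustomobject]@{
            Activities = @(
                [pscustomobject]@{
                    Name    = 'CopyActivity'
                    # Assumption: inputs and outputs each expose a Name property.
                    Inputs  = @([pscustomobject]@{ Name = 'SourceDataset' })
                    Outputs = @([pscustomobject]@{ Name = 'SinkDataset' })
                }
            )
        }
    }
)

# Walk the hierarchy and emit one row per pipeline/activity/dataset combination.
$rows = foreach ($pipeline in $pipelines) {
    foreach ($activity in $pipeline.Properties.Activities) {
        foreach ($dataset in $activity.Inputs) {
            [pscustomobject]@{
                Pipeline  = $pipeline.PipelineName
                Activity  = $activity.Name
                Direction = 'Input'
                Dataset   = $dataset.Name
            }
        }
        foreach ($dataset in $activity.Outputs) {
            [pscustomobject]@{
                Pipeline  = $pipeline.PipelineName
                Activity  = $activity.Name
                Direction = 'Output'
                Dataset   = $dataset.Name
            }
        }
    }
}

$rows | Format-Table -AutoSize
```

I used `foreach` statements here rather than nested `ForEach-Object` calls so that each level has its own named variable instead of competing for `$_`.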
Much of the information returned by these commands is held in complex types. Most of them have a Type property and a Properties property. The information in these complex types can be flattened using the following:
    $summary = $datasets | select DataSetName, ResourceGroupName, `
        @{Name="Type"; Expression={$_.Properties | Select-Object -ExpandProperty Type}}, `
        @{Name="LinkedServiceName"; Expression={$_.Properties | Select-Object -ExpandProperty LinkedServiceName}}
List unique type values:
    $summary | select Type | Sort-Object -Property Type -Unique
Group type values:
    $summary | Group-Object Type
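The flattening, unique-listing, and grouping steps can be tried end to end on stand-in data. This is a minimal sketch: `New-MockDataset` is a hypothetical helper, and the dataset names, type values, and linked service names are invented for illustration. It also reads the nested values with plain dot notation (`$_.Properties.Type`), which for a single object should give the same value as piping to `Select-Object -ExpandProperty`.

```powershell
# Hypothetical helper that builds an object shaped like a dataset result.
function New-MockDataset($Name, $Type, $LinkedService) {
    [pscustomobject]@{
        DataSetName       = $Name
        ResourceGroupName = 'MyResourceGroup'
        Properties        = [pscustomobject]@{ Type = $Type; LinkedServiceName = $LinkedService }
    }
}

# Illustrative sample data only.
$datasets = @(
    New-MockDataset 'BlobInput'  'AzureBlob'     'StorageLinkedService'
    New-MockDataset 'BlobOutput' 'AzureBlob'     'StorageLinkedService'
    New-MockDataset 'SqlOutput'  'AzureSqlTable' 'SqlLinkedService'
)

# Flatten the nested Properties values into top-level columns.
$summary = $datasets | Select-Object DataSetName, ResourceGroupName,
    @{ Name = 'Type'; Expression = { $_.Properties.Type } },
    @{ Name = 'LinkedServiceName'; Expression = { $_.Properties.LinkedServiceName } }

# Distinct type values, as plain strings.
$uniqueTypes = $summary | Select-Object -ExpandProperty Type | Sort-Object -Unique

# Number of datasets per type.
$countsByType = $summary | Group-Object Type | Select-Object Name, Count

$summary | Format-Table -AutoSize
$countsByType | Format-Table -AutoSize
```

Keeping only `Name` and `Count` from `Group-Object` gives a compact per-type tally; the full group objects also carry a `Group` property with the matching rows if you need them.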