Thursday, April 7, 2022

How to implement CI/CD in Azure Data Factory

A CI/CD pipeline is a series of steps that must be performed in order to deliver a new version of software. Continuous integration/continuous delivery (CI/CD) pipelines are a practice focused on improving software delivery using either a DevOps or site reliability engineering (SRE) approach.

These are just examples of common stages you may find. Your pipeline will be unique to the requirements of your organization.

In ADF, CI/CD means moving Data Factory pipelines from one environment to another, such as Dev 🡪 QA 🡪 Prod.

Create a resource group for each environment, each with a similar set of resources.

In this example, the ADF pipeline uses an Azure Storage Account and a Key Vault.

Create a new Azure DevOps project.

Create a new repository.

Configure the DEV-ADF-U code repository as ADFRepo and publish a pipeline.

Make sure all these linked services are using DEV resources.

Once you publish, a new branch named adf_publish is automatically created in the ADFRepo repository.

Inside the adf_publish branch, you can see the ARM template files: one JSON file for the pipeline and one JSON file for the parameters.
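Before wiring up a release, it helps to know which parameters the generated file exposes. Here is a minimal Python sketch that lists them. The sample JSON is a hypothetical, trimmed-down stand-in for ARMTemplateParametersForFactory.json; the parameter names and values in it are my assumptions, so expect different ones in your own adf_publish branch.

```python
import json

# Hypothetical, trimmed-down stand-in for ARMTemplateParametersForFactory.json.
# The parameter names and values are illustrative assumptions only.
SAMPLE_PARAMETERS = """
{
  "contentVersion": "1.0.0.0",
  "parameters": {
    "factoryName": {"value": "DEV-ADF-U"},
    "LS_KeyVault_baseUrl": {"value": "https://kv-dev.vault.azure.net/"}
  }
}
"""


def list_parameters(parameters_json: str) -> dict:
    """Return a {parameter name: value} mapping from an ARM parameters document."""
    document = json.loads(parameters_json)
    return {name: entry.get("value") for name, entry in document["parameters"].items()}


# Print every parameter so you can spot the values that still point at DEV.
for name, value in list_parameters(SAMPLE_PARAMETERS).items():
    print(f"{name} = {value}")
```

Any value that names a DEV resource is a candidate for an override when you deploy to UAT.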

Now let’s see how to move the DEV data factory to the UAT data factory.

In Azure DevOps, click Pipelines 🡪 Releases 🡪 Create pipeline 🡪 Empty job.

Set stage name as UAT

Rename the pipeline as appropriate.

Click Add an artifact.

Configure the artifact values for the release pipeline as shown below.

You can add further stages, such as PROD after UAT.

Click on UAT Stage Job

Click Add tasks 🡪 click + on Agent job 🡪 search for the task using the keyword ARM.

In this case, the ARM template deployment task is used.

Configure the settings for the UAT environment.

First, authorize the Azure subscription. This creates a service principal for the pipeline, which you can see under App registrations in AAD.

Then configure the rest of the settings.

Select Resource Group as RG-UAT

Select ARMTemplateForFactory.json as Template

Select ARMTemplateParametersForFactory.json as Template parameters


Override the template parameters, replacing the DEV values with the UAT values.
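To make the override step concrete, here is a small Python sketch of what it amounts to: take the DEV parameter values, swap in the UAT ones, and format them as `-name "value"` pairs, which is the style the override box of the ARM template deployment task accepts. All parameter names and values below are hypothetical; use the ones from your own parameters file.

```python
import copy

# Hypothetical DEV parameter values; a real ARMTemplateParametersForFactory.json
# will have its own names, depending on your linked services.
DEV_PARAMETERS = {
    "parameters": {
        "factoryName": {"value": "DEV-ADF-U"},
        "LS_KeyVault_baseUrl": {"value": "https://kv-dev.vault.azure.net/"},
    }
}

# Hypothetical UAT replacements for the values above.
UAT_OVERRIDES = {
    "factoryName": "UAT-ADF-U",
    "LS_KeyVault_baseUrl": "https://kv-uat.vault.azure.net/",
}


def apply_overrides(parameters: dict, overrides: dict) -> dict:
    """Return a copy of the parameters document with the given values replaced."""
    unknown = sorted(set(overrides) - set(parameters["parameters"]))
    if unknown:
        raise KeyError(f"Not defined in the template parameters: {unknown}")
    result = copy.deepcopy(parameters)  # leave the DEV document untouched
    for name, value in overrides.items():
        result["parameters"][name]["value"] = value
    return result


def to_override_string(overrides: dict) -> str:
    """Format overrides as space-separated -name "value" pairs."""
    return " ".join(f'-{name} "{value}"' for name, value in overrides.items())


uat_parameters = apply_overrides(DEV_PARAMETERS, UAT_OVERRIDES)
print(to_override_string(UAT_OVERRIDES))
```

The check for unknown names is there because a typo in an override name is easy to make and hard to spot once the release is running.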

Once you configure all the values, it will be as shown below.

Click the Save button. This creates the release pipeline.

To release to UAT:

Click Releases 🡪 ADFReleasePipeline 🡪 Create release, then click the Create button.

Once you click on it, it will release everything to UAT.

In my case, I hadn’t yet purchased or requested the hosted parallelism that a release requires, so I got an error. You can request it for free by submitting a form; approval takes 2-3 days.

##[Error 1]

No hosted parallelism has been purchased or granted. To request a free parallelism grant, please fill out the following form


Cheers!
Uma

Sunday, March 6, 2022

How to authenticate Azure Storage Account using Key Vault in Azure Data Factory

Open the portal blade for your storage account 🡪 Select Access keys

To copy key values, you must first click Show keys 🡪 Copy the Connection string value for key1

Open key vault 🡪 Secrets 🡪 Generate/Import

Paste the contents of the clipboard into the Value field, then enter a Name for the secret and create.

Grant Access to the Key Vault

Your data factory cannot use the secrets stored in your key vault until you grant it permission to do so. The data factory instance has an associated managed identity – a managed application registered in Azure Active Directory – which was created automatically when you created the data factory. You must grant access to this identity.

Open your key vault and select Access policies.

On the Access policies blade, locate and click the + Add Access Policy button.

Select Get and List under Secret permissions.

Under Select principal, click None selected

This opens the security principal selection blade

At the top of the blade is a search input field. An ADF managed identity service principal has the same name as the ADF instance it represents – enter the name of your data factory to search for the service principal. The search will return one matching item, as shown in the below figure. Click the item to choose it, then click the Select button at the bottom of the blade.
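For reference, the policy you just created in the portal corresponds to one entry in the key vault's access policy list. Below is a minimal Python sketch of that entry in the shape ARM templates use for key vault access policies. The tenant and object IDs are zeroed-out placeholders, not real values; in practice the object ID is that of your data factory's managed identity.

```python
import json

# Placeholder GUIDs for illustration only; substitute your own tenant ID and
# the object ID of the data factory's managed identity.
PLACEHOLDER_TENANT_ID = "00000000-0000-0000-0000-000000000000"
PLACEHOLDER_OBJECT_ID = "00000000-0000-0000-0000-000000000000"


def secret_read_policy(tenant_id: str, object_id: str) -> dict:
    """Build an access policy entry that grants Get and List on secrets only."""
    return {
        "tenantId": tenant_id,
        "objectId": object_id,
        "permissions": {"secrets": ["get", "list"]},
    }


policy = secret_read_policy(PLACEHOLDER_TENANT_ID, PLACEHOLDER_OBJECT_ID)
print(json.dumps(policy, indent=2))
```

Note that no key or certificate permissions are granted; the data factory only needs to read secrets.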

Create a Key Vault ADF Linked Service

Azure Data Factory accesses a key vault in exactly the same way it does other types of external resource: using a linked service. To refer to a key vault from within your data factory, you must create a linked service to represent it.

Add a new linked service, then search for and select the Azure Key Vault data store. Click Continue.

On the New linked service (Azure Key Vault) blade, provide a Name for the key vault linked service, then select your key vault from the Azure key vault name dropdown

Use the Test connection button to check the linked service configuration, and when successful, click Create


Create a New Storage Account Linked Service

Create another new linked service, this time using the Azure Blob Storage data store.

Ensure that Authentication method is set to “Account key,” then use the toggle below that field to change the connection type from “Connection string” to “Azure Key Vault.”

Select your key vault linked service from the AKV linked service dropdown, then enter the name of your storage account connection string secret.

Use the Test connection button to check the linked service configuration, and when successful, click Save.

At runtime, the new linked service obtains its credentials from the key vault by reading the value of your named secret, authorized using the ADF instance’s managed identity.
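If you look at the linked services in your repository afterwards, you should find two JSON definitions roughly like the ones this Python sketch builds. Treat it as an approximation: the linked service names, vault name, and secret name are hypothetical, and the exact JSON that the ADF UI writes may differ in detail, so compare it with your own files.

```python
import json


def key_vault_linked_service(name: str, vault_name: str) -> dict:
    """Sketch of a Key Vault linked service definition."""
    return {
        "name": name,
        "properties": {
            "type": "AzureKeyVault",
            "typeProperties": {"baseUrl": f"https://{vault_name}.vault.azure.net/"},
        },
    }


def blob_linked_service(name: str, key_vault_ls_name: str, secret_name: str) -> dict:
    """Sketch of a Blob Storage linked service whose connection string is a secret reference."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobStorage",
            "typeProperties": {
                # Instead of an inline connection string, point at the secret.
                "connectionString": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_ls_name,
                        "type": "LinkedServiceReference",
                    },
                    "secretName": secret_name,
                }
            },
        },
    }


# All names below are hypothetical examples.
kv = key_vault_linked_service("LS_KeyVault", "kv-dev")
blob = blob_linked_service("LS_BlobStorage", kv["name"], "storage-connection-string")
print(json.dumps([kv, blob], indent=2))
```

The point to notice is that the storage definition contains no credential at all, only the name of the secret and a reference to the key vault linked service.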


Cheers!
Uma

Tuesday, March 1, 2022

How to set up code repository for Azure Data Factory using Azure DevOps

First, if you don’t have an Azure DevOps account, sign up for free. Make sure you select the Default directory.

Create an organization and a project.

Organization - UmaAzure01
Project – azure_adf

Create a new repository. Think of a repository as a parent folder.

 

Now go to Azure Data Factory and click on Set up code repository.

Here, the collaboration branch is the branch that holds the final code of the application. Usually the main branch, also called main/master, is set as the collaboration branch.

You can create multiple other branches, which are called feature branches.

Feature branches are merged into the master branch through a pull request, which is a request for approval of the merge.

Create a new working branch, or use main. Here the working branch is adf_dev_branch1.

Let’s create a pipeline to test this. Create a pipeline under adf_dev_branch1 and save it.

You can log in to the repository and create the Pull Request to merge

adf_dev_branch1 🡪 main

Approve the Pull Request and click on the complete button.


Once the merge is completed, you can see the changes in the main branch in ADF.
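The same pull request can also be raised through the Azure DevOps REST API instead of the web UI. The Python sketch below only builds the URL and request body and does not send anything. The organization and project come from this post, while the repository name, the pull request title, and the api-version value are my assumptions; a real call would also need authentication, such as a personal access token.

```python
import json
from urllib.parse import quote


def pull_request_url(organization: str, project: str, repository: str,
                     api_version: str = "7.0") -> str:
    """Build the endpoint for creating a pull request (api_version is an assumption)."""
    return (
        f"https://dev.azure.com/{quote(organization)}/{quote(project)}"
        f"/_apis/git/repositories/{quote(repository)}/pullrequests"
        f"?api-version={api_version}"
    )


def pull_request_payload(source_branch: str, target_branch: str, title: str) -> dict:
    """Build the request body; branches are given as full Git ref names."""
    return {
        "sourceRefName": f"refs/heads/{source_branch}",
        "targetRefName": f"refs/heads/{target_branch}",
        "title": title,
    }


# "azure_adf_repo" and the title are hypothetical; use your own values.
url = pull_request_url("UmaAzure01", "azure_adf", "azure_adf_repo")
payload = pull_request_payload("adf_dev_branch1", "main", "Add test pipeline")
print(url)
print(json.dumps(payload, indent=2))
```

Approving and completing the pull request still happens as described above.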

Cheers!
Uma