In short, to configure authentication with ADLS using the client credential, you must register a new application with Active Directory service and then give your application access to your ADL account. After you've performed these steps, you can configure your core-site.xml.
Note for Cloudbreak users: When you create a cluster with Cloudbreak, you can configure authentication with ADLS on the "Add File System" page of the create cluster wizard and then you must perform an additional step as described in Cloudbreak documentation. If you do this, you do not need to perform the steps below. If you have already created a cluster with Cloudbreak but did not perform ADLS configuration on the "Add File System" page of the create cluster wizard, follow the steps below:
1. To use ADLS storage, you must have a subscription for Data Lake Storage.
2. To access ADLS data in HDP, you must have an HDP version that supports that. I am using HDP 2.6.1, which supports connecting to ADLS using the ADL connector.
2. Navigate to your
Active Directory and then select App Registrations:
3. Create a new web application by clicking on
+New application registration.
4. Specify an application name, type (Web app/API), and sign-on URLs.
Remember the application name: you will later add it to your ADLS account as an authorized user:
5. Once an application is created, navigate to the application configuration and find the Keys in the application's settings:
6. Create a key by entering key description, selecting a key duration, and then clicking
Save. Make sure to copy and save the key value. You won't be able to retrieve it after you leave the page.
7. Write down the properties that you will need to authenticate:
Step 2: Assign permissions to your application
1.Log in to the Azure Portal.
2.If you don't have an ADL account, create one:
3.Navigate to your ADL account and then select Access Control (IAM):
4.Click on +Add to add to add role-based permissions.
5.Under Role select the "Owner". Under Select, select your application. This will grant the "Owner" role for this ADL account to your application.
Note: If you are not able to assign the "Owner" role, you can set fine-grained RWX ACL permissions for your application, allowing it access to the files and folders of your ADLS account, as documented here.
Note: If using a corporate Azure account, you may be unable to perform the role assignment step. In this case, contact your Azure admin to perform this step for you.
Step 3: Configure core-site.xml
1.Add the following four properties to your core-site.xml.
While "fs.adl.oauth2.access.token.provider.type" must be set to “ClientCredential” you can obtain the remaining three parameters from step 7 above.
To make sure that the authentication works, try accessing data. To test access, SSH to any cluster node, switch to the hdfs user by using sudo su hdfs and then try accessing your data. The URL structure is: