Created on 06-02-2017 06:24 PM - edited 08-17-2019 12:41 PM
To control access, Azure uses Azure Active Directory (Azure AD), a multi-tenant cloud-based directory and identity management service. To learn more, refer to https://docs.microsoft.com/en-us/azure/active-directory/active-directory-whatis.
In short, to configure authentication with ADLS using the client credential, you must register a new application with the Azure Active Directory service and then give that application access to your ADLS account. After you've performed these steps, you can configure your core-site.xml.
Note for Cloudbreak users: When you create a cluster with Cloudbreak, you can configure authentication with ADLS on the "Add File System" page of the create cluster wizard, and then perform one additional step as described in the Cloudbreak documentation. If you do this, you do not need to perform the steps below. If you have already created a cluster with Cloudbreak but did not configure ADLS on the "Add File System" page of the create cluster wizard, follow the steps below.
Prerequisites
1. To use ADLS storage, you must have an Azure subscription for Data Lake Store.
2. To access ADLS data in HDP, you must have an HDP version that supports it. I am using HDP 2.6.1, which supports connecting to ADLS using the ADL connector.
Step 1: Register an application
1. Log in to the Azure Portal at https://portal.azure.com/.
2. Navigate to your Active Directory and then select App registrations.
3. Create a new web application by clicking +New application registration.
4. Specify an application name, type (Web app/API), and sign-on URL. Remember the application name: you will later add it to your ADLS account as an authorized user.
5. Once the application is created, navigate to the application's settings and find Keys.
6. Create a key by entering a key description, selecting a key duration, and then clicking Save. Make sure to copy and save the key value; you won't be able to retrieve it after you leave the page.
7. Write down the values that you will need to authenticate: the application ID, the key value, and the OAuth 2.0 token endpoint.
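The token endpoint follows a predictable pattern based on your Azure AD tenant (directory) ID. A minimal sketch, using a placeholder tenant ID:

```shell
# Placeholder tenant (directory) ID; copy your real one from Azure AD > Properties.
TENANT_ID="00000000-0000-0000-0000-000000000000"
# The OAuth 2.0 token endpoint, later used as fs.adl.oauth2.refresh.url:
TOKEN_ENDPOINT="https://login.microsoftonline.com/${TENANT_ID}/oauth2/token"
echo "${TOKEN_ENDPOINT}"
```

Alternatively, you can copy the exact endpoint from App registrations > Endpoints rather than constructing it yourself.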
Step 2: Add the application to your ADLS account
1. Log in to the Azure Portal.
2. If you don't have an ADLS account, create one.
3. Navigate to your ADLS account and then select Access Control (IAM).
4. Click +Add to add role-based permissions.
5. Under Role, select "Owner". Under Select, select your application. This grants the "Owner" role for this ADLS account to your application.
Note: If you are not able to assign the "Owner" role, you can instead set fine-grained RWX ACL permissions for your application on the files and folders of your ADLS account, as described in the Azure Data Lake Store access control documentation.
Note: If you are using a corporate Azure account, you may be unable to perform the role assignment step yourself. In this case, contact your Azure admin to perform this step for you.
Step 3: Configure core-site.xml
1. Add the following four properties to your core-site.xml. While fs.adl.oauth2.access.token.provider.type must be set to "ClientCredential", you can obtain the remaining three values from step 7 above.
<property>
  <name>fs.adl.oauth2.access.token.provider.type</name>
  <value>ClientCredential</value>
</property>
<property>
  <name>fs.adl.oauth2.client.id</name>
  <value>APPLICATION-ID</value>
</property>
<property>
  <name>fs.adl.oauth2.credential</name>
  <value>KEY</value>
</property>
<property>
  <name>fs.adl.oauth2.refresh.url</name>
  <value>TOKEN-ENDPOINT</value>
</property>
2. (Optional) It's recommended that you protect your credentials with credential providers. For instructions, refer to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_cloud-data-access/content/adls-protectin....
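As a sketch of what the protected configuration might look like: after storing the key in a Hadoop credential store with hadoop credential create fs.adl.oauth2.credential -provider jceks://hdfs/user/admin/adls.jceks (the alias comes from the property name; the jceks path here is illustrative), you would remove the plaintext fs.adl.oauth2.credential property and point Hadoop at the credential store instead:

```xml
<!-- Illustrative path; use the jceks file you actually created. -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/user/admin/adls.jceks</value>
</property>
```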
Test access
To make sure that authentication works, try accessing data. SSH to any cluster node, switch to the hdfs user with sudo su hdfs, and then try accessing your data. The URL structure is:
adl://ACCOUNT-NAME.azuredatalakestore.net/PATH
For example, to access "testfile" located in a directory called "testdir", stored in a data lake store called "mytest", the URL is:
adl://mytest.azuredatalakestore.net/testdir/testfile
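The URL structure can be sketched as a small shell snippet that assembles an ADLS URL from its parts (the account, directory, and file names are the illustrative values used in this example):

```shell
# Illustrative names matching the example above.
ACCOUNT="mytest"
DIR="testdir"
FILE="testfile"
# ADLS URLs have the form adl://ACCOUNT.azuredatalakestore.net/PATH
URL="adl://${ACCOUNT}.azuredatalakestore.net/${DIR}/${FILE}"
echo "${URL}"
```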
The following FileSystem shell commands demonstrate access to a data lake store named mytest:
hadoop fs -ls adl://mytest.azuredatalakestore.net/
hadoop fs -mkdir adl://mytest.azuredatalakestore.net/testDir
hadoop fs -put testFile adl://mytest.azuredatalakestore.net/testDir/testFile
hadoop fs -cat adl://mytest.azuredatalakestore.net/testDir/testFile
test file content
For more information about working with ADLS, refer to Getting Started with ADLS in Hortonworks documentation.