You will often need to use data that does not exist in your Environment's Data Lake. This article covers two scenarios in which you need to access data outside your Data Lake: in a Storage Account in the same resource group as your CDP Public Cloud environment, or in a different resource group in the same subscription.

UPDATE: Scenarios 3 and 4 below cover RAZ!

Scenario 1 - Adding External Storage Account Access to your Data Access Identity (non RAZ)

In this scenario, you want a single managed identity to be able to access both your Data Lake Storage Account and your External Storage Account. Here I grant access to only a single container within the storage account. As the following scenario shows, you could instead grant access to the entire storage account; I'm showing both purely for illustration.

  1. Find your DataAccessIdentity in the Azure Portal, and note its Object ID.
    Screen Shot 2021-07-15 at 9.16.36 AM.png
  2. Find your External Storage Account and note its name, resource group, and the container name you wish to grant access to.
  3. Pull up an Azure Cloud Shell and execute the following command (with the appropriate substitutions).

az role assignment create \
  --assignee "$DATAACCESS_OBJECTID" \
  --role 'ba92f5b4-2d11-453d-a403-e96b0029c9fe' \
  --scope "/subscriptions/$SUBSCRIPTIONID/resourceGroups/$RESOURCEGROUPNAME/providers/Microsoft.Storage/storageAccounts/$STORAGEACCOUNTNAME/blobServices/default/containers/$CONTAINERNAME"

Note: 'ba92f5b4-2d11-453d-a403-e96b0029c9fe' is the GUID of the built-in Azure role "Storage Blob Data Contributor", which grants read, write, and delete access to blob data in the container.

Now, when you view the role assignments for this managed identity, you should see a new entry for the external storage account.

Screen Shot 2021-07-15 at 9.31.48 AM.png

Note: It may take several minutes until this page reflects your RBAC change.

Anyone who has an IDBroker mapping to this MSI can now access this new container in the external storage account.
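
The values substituted into the command above can also be gathered from the Cloud Shell rather than the portal. The following is a minimal sketch, assuming a bash shell with the Azure CLI signed in. The helper function names and every placeholder value in the comments are hypothetical, and the name of the DataAccessIdentity in your environment will differ.

```shell
# Sketch only: helper names and placeholder values are assumptions for illustration.

# Build the container-level scope string used by `az role assignment create`.
container_scope() {
  # $1 = subscription ID, $2 = resource group, $3 = storage account, $4 = container
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s/blobServices/default/containers/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Look up the Object (principal) ID of a user-assigned managed identity.
identity_object_id() {
  # $1 = identity name, $2 = resource group that holds the identity
  az identity show --name "$1" --resource-group "$2" --query principalId --output tsv
}

# List the role assignments an identity holds at a scope, to confirm the change.
show_assignments() {
  # $1 = Object ID, $2 = scope
  az role assignment list --assignee "$1" --scope "$2" --output table
}

# Example usage (uncomment and substitute your own values):
# SUBSCRIPTIONID=$(az account show --query id --output tsv)
# DATAACCESS_OBJECTID=$(identity_object_id "my-dataaccess-identity" "my-cdp-rg")
# SCOPE=$(container_scope "$SUBSCRIPTIONID" "my-ext-rg" "myextstorage" "mycontainer")
# show_assignments "$DATAACCESS_OBJECTID" "$SCOPE"
```

Keeping the scope in one helper avoids typos in the long path, and listing the assignments at that scope is a quick check that does not depend on the portal page refreshing.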

Scenario 2 - Adding External Storage Account Access to a new Managed Identity (non RAZ)

In this scenario, we will create a brand new managed identity and provision access to the entire storage account. 

  1. Create a new MSI
    screen1upload.png
  2. Now, we'll use the Portal to add the role assignment.
    Screen Shot 2021-07-15 at 9.04.18 AM.png
    Note: We used the Cloud Shell in Scenario 1 because the portal doesn't (yet) support scoping the role down to the container.
    After a few minutes, you should see the role assignment appear.
    Screen Shot 2021-07-15 at 9.37.43 AM.png
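
If you prefer the command line to the Portal, the two steps above could be sketched roughly as follows. This is an illustration under assumptions, not the method shown in the screenshots: the function names and placeholder values are hypothetical, and the role is assumed to be Storage Blob Data Contributor, as in Scenario 1.

```shell
# Sketch only: function names and placeholder values are hypothetical.

# Scope string covering an entire storage account.
storage_account_scope() {
  # $1 = subscription ID, $2 = resource group, $3 = storage account
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$1" "$2" "$3"
}

# Create a user-assigned managed identity and print its Object (principal) ID.
create_identity() {
  # $1 = identity name, $2 = resource group
  az identity create --name "$1" --resource-group "$2" --query principalId --output tsv
}

# Assign Storage Blob Data Contributor (assumed role) at the given scope.
grant_blob_contributor() {
  # $1 = Object ID of the identity, $2 = scope
  az role assignment create \
    --assignee-object-id "$1" \
    --assignee-principal-type ServicePrincipal \
    --role 'Storage Blob Data Contributor' \
    --scope "$2"
}

# Example usage (uncomment and substitute your own values):
# OBJECTID=$(create_identity "my-ext-storage-msi" "my-cdp-rg")
# SCOPE=$(storage_account_scope "$(az account show --query id --output tsv)" "my-ext-rg" "myextstorage")
# grant_blob_contributor "$OBJECTID" "$SCOPE"
```

Passing the Object ID together with the principal type avoids a directory lookup, which can fail for an identity that was created only seconds earlier.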

You are now ready to map this MSI to users in CDP (via an IDBroker Mapping). Since this is a new MSI, let's quickly review how to do that.

  1. Head to the Properties blade for the new MSI and note the Resource ID.
    screen2upload.png
  2. Now let's head over to the CDP Console. Head to your Environment and click Actions and Manage Access.
    screen3upload.png
  3. Click IDBroker Mappings and click Edit.
    Screen Shot 2021-07-15 at 9.49.26 AM.png
  4. Click the plus sign to add a new mapping. Then start typing the name of your user or group, and paste the Resource ID you noted into the Role field.
  5. Click Save and Sync and you're done!
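
Note that the Role field takes the Resource ID of the MSI, not the Object ID used for the role assignment in Scenario 1, and the two are easy to confuse. The sketch below, which assumes a bash shell with the Azure CLI signed in, fetches the Resource ID and sanity-checks its shape before you paste it. The function names and placeholder values are hypothetical.

```shell
# Sketch only: function names and placeholder values are hypothetical.

# Print the full Azure Resource ID of a user-assigned managed identity.
identity_resource_id() {
  # $1 = identity name, $2 = resource group
  az identity show --name "$1" --resource-group "$2" --query id --output tsv
}

# Succeed only if the string has the shape of a user-assigned managed identity Resource ID.
is_identity_resource_id() {
  case "$1" in
    /subscriptions/*/resource[Gg]roups/*/providers/Microsoft.ManagedIdentity/userAssignedIdentities/?*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example usage (uncomment and substitute your own values):
# RESOURCE_ID=$(identity_resource_id "my-ext-storage-msi" "my-cdp-rg")
# is_identity_resource_id "$RESOURCE_ID" && echo "Paste into the Role field: $RESOURCE_ID"
```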

If you followed Scenario 1 and have the DataAccessIdentity mapped to your user, you should now be able to access both the data container in the Data Lake Storage Account and our new container in the External Storage Account.

1docscreen.png

If you followed Scenario 2 and have the new MSI mapped to your user, you should now be able to access ONLY the External Storage Account (including our new container), and not the Data Lake Storage Account.

2docscree.png
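
One way to run this kind of check yourself is to list the container over ABFS from a cluster node. The sketch below assumes a node where the `hdfs` client is available and where you are logged in as the mapped user. The function names and the container and account names in the comments are hypothetical.

```shell
# Sketch only: function names and placeholder values are hypothetical.

# Build an ABFS URI for a container in a storage account.
abfs_uri() {
  # $1 = container, $2 = storage account, $3 = optional path inside the container
  printf 'abfs://%s@%s.dfs.core.windows.net/%s\n' "$1" "$2" "${3:-}"
}

# List the contents of a container (or of a path inside it).
list_container() {
  # Same arguments as abfs_uri.
  hdfs dfs -ls "$(abfs_uri "$1" "$2" "${3:-}")"
}

# Example usage (uncomment and substitute your own values):
# list_container "mycontainer" "myextstorage"      # should succeed in both scenarios
# list_container "data" "mydatalakestorage"        # should fail under Scenario 2
```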

Scenario 3 - Adding External Storage Account in the same Subscription (RAZ)

With RAZ, you can grant access to an external Storage Account by assigning your RAZ Managed Identity, on that external Storage Account, the same two roles (Storage Blob Data Owner and Storage Blob Delegator) that it already holds on your Data Lake Storage Account.

Here is what the role assignments of your RAZ Managed Identity look like in a minimal CDP setup with RAZ:

Screenshot 2023-07-28 at 1.37.13 PM.png

Just add these same two roles to another Storage Account to allow RAZ/your CDP Environment to interact with another Storage Account:

Screenshot 2023-07-28 at 1.40.12 PM.png

Screenshot 2023-07-28 at 1.40.33 PM.png

Your Managed Identity role set should then look like this:

Screenshot 2023-07-28 at 1.42.54 PM.png
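
The same two role assignments could also be made from the Azure Cloud Shell instead of the Portal. The following is a minimal sketch under assumptions: the function names and placeholder values are hypothetical, and it assumes the roles are assigned at the scope of the whole external Storage Account.

```shell
# Sketch only: function names and placeholder values are hypothetical.

# Scope string covering an entire storage account.
storage_account_scope() {
  # $1 = subscription ID, $2 = resource group, $3 = storage account
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$1" "$2" "$3"
}

# Assign both roles required by RAZ to the RAZ managed identity at one scope.
grant_raz_roles() {
  # $1 = Object ID of the RAZ managed identity, $2 = scope
  for raz_role in 'Storage Blob Data Owner' 'Storage Blob Delegator'; do
    az role assignment create \
      --assignee-object-id "$1" \
      --assignee-principal-type ServicePrincipal \
      --role "$raz_role" \
      --scope "$2" || return 1
  done
}

# Example usage (uncomment and substitute your own values):
# SCOPE=$(storage_account_scope "$(az account show --query id --output tsv)" "my-ext-rg" "myextstorage")
# grant_raz_roles "<raz-identity-object-id>" "$SCOPE"
```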

Scenario 4 - Adding External Storage Account in a different Subscription (RAZ)

We follow the same procedure as above, just with a different scope (because we're integrating with a storage account in a different subscription).

We add Storage Blob Data Owner and Storage Blob Delegator on the Storage Account in the other subscription:

Screenshot 2023-07-28 at 1.43.16 PM.png

Screenshot 2023-07-28 at 1.43.36 PM.png

The RAZ Managed Identity then has this role set for the scope of our "other" subscription:

Screenshot 2023-07-28 at 1.43.54 PM.png
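
To confirm the cross-subscription assignments without the Portal, you could list the roles the RAZ identity holds at the external account's scope and check that both are present. This is a sketch under assumptions: the function names and placeholder values are hypothetical, and it assumes your signed-in user can read role assignments in the other subscription.

```shell
# Sketch only: function names and placeholder values are hypothetical.

# Scope string covering an entire storage account; pass the OTHER subscription's ID.
storage_account_scope() {
  # $1 = subscription ID, $2 = resource group, $3 = storage account
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$1" "$2" "$3"
}

# Print the role names an identity holds at a scope, one per line.
roles_at_scope() {
  # $1 = Object ID, $2 = subscription ID that owns the scope, $3 = scope
  az role assignment list \
    --subscription "$2" \
    --assignee "$1" \
    --scope "$3" \
    --query '[].roleDefinitionName' \
    --output tsv
}

# Read role names on stdin and succeed only if both RAZ roles are present.
has_raz_roles() {
  raz_roles_seen=$(cat)
  printf '%s\n' "$raz_roles_seen" | grep -qx 'Storage Blob Data Owner' &&
    printf '%s\n' "$raz_roles_seen" | grep -qx 'Storage Blob Delegator'
}

# Example usage (uncomment and substitute your own values):
# OTHER_SUB="<other-subscription-id>"
# SCOPE=$(storage_account_scope "$OTHER_SUB" "my-other-rg" "myotherstorage")
# roles_at_scope "<raz-identity-object-id>" "$OTHER_SUB" "$SCOPE" | has_raz_roles && echo "RAZ roles present"
```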

Based on Scenarios 3 and 4, we now can interact with a total of 3 storage accounts:

  • perro4 (our Data Lake storage account)
  • perro4ext (our external storage account in the same subscription)
  • perroext (our external storage account in a different subscription)

Screenshot 2023-07-28 at 1.46.38 PM.png

DISCLAIMER: This article is contributed by an external user. The steps may not be verified by Cloudera and may not be applicable for all use cases and may be very specific to a particular distribution. Please follow with caution and at your own risk. If needed, raise a support case to get confirmation.
