Multiple WASB "volumes" on a single Azure cluster?

Explorer

Hi,

I have a use case where an HDP cluster on Azure is used for both development and testing. Ideally, we would like to keep the dev and test data in two different WASB storage accounts. Is there a way to define multiple accounts and keys in core-site.xml? And how would they map onto the file system? Would it simply be wasb://mybucket[1-2]?

Thanks!

1 ACCEPTED SOLUTION

@Alex Gauthier You could use DASH, but it is only a workaround for the limitations of WASB (e.g. IOPS limits). Have you considered Azure Data Lake Storage (ADLS) as an option? It does not have such limits, and HDP supports it. The only prerequisite is creating a Data Lake storage account.

As a side note, if you would like to automate cluster management on Azure, Cloudbreak supports WASB (including DASH) and now ADLS too. It automates both the provisioning and configuration steps.
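
On the original question about several accounts in core-site.xml: as far as I know, the hadoop-azure (WASB) connector reads one `fs.azure.account.key.<account>.blob.core.windows.net` property per storage account, and data is addressed as `wasb://<container>@<account>.blob.core.windows.net/<path>` rather than `wasb://mybucket[1-2]`. Below is a minimal Python sketch that renders those properties and URIs. The account names (`devaccount`, `testaccount`), the container name (`data`), and the keys are hypothetical placeholders, and the helper functions are only for illustration; please verify against the hadoop-azure documentation for your HDP version.

```python
import xml.etree.ElementTree as ET


def account_key_property(account: str) -> str:
    """Name of the core-site.xml property holding the key for one storage account."""
    return f"fs.azure.account.key.{account}.blob.core.windows.net"


def core_site_properties(keys: dict) -> str:
    """Render one <property> element per storage account (illustrative helper)."""
    root = ET.Element("configuration")
    for account, key in keys.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = account_key_property(account)
        ET.SubElement(prop, "value").text = key
    return ET.tostring(root, encoding="unicode")


def wasb_uri(container: str, account: str, path: str = "/") -> str:
    """Build the WASB URI for a path in a container of a given storage account."""
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical accounts and placeholder keys, one per environment.
    keys = {"devaccount": "DEV_KEY_PLACEHOLDER", "testaccount": "TEST_KEY_PLACEHOLDER"}
    print(core_site_properties(keys))
    print(wasb_uri("data", "devaccount", "/raw"))
    print(wasb_uri("data", "testaccount", "/raw"))
```

With both keys present in core-site.xml, dev and test data would be reached through their own fully qualified URIs, for example with `hdfs dfs -ls` on each URI.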


3 REPLIES

Explorer

Quick update: DASH is a package available from Microsoft that allows "sharding" across multiple storage accounts:

https://github.com/MicrosoftDX/Dash/tree/master/DashServer

Explorer

Thanks! It wasn't clear whether CB 1.6 supports ADLS. It's certainly a better option than WASB shards with DASH.