
Multiple WASB "volumes" on a single Azure cluster?

Contributor

Hi,

I have a use case where an HDP cluster on Azure is used for dev and test. Ideally, we would like to separate the dev and test data into 2 different WASB storage accounts. Is there a way to define multiple accounts and keys in core-site.xml? And how would they map onto the file system? Would it simply be wasb://mybucket[1-2]?

Thanks!

1 ACCEPTED SOLUTION


@Alex Gauthier You could use DASH, but it is only a workaround for the limitations (e.g. IOPS) of WASB. Have you considered Azure Data Lake Storage as an option? It does not have such limits, and HDP supports it. The only prerequisite is creating a Data Lake storage account.

As a side note, if you would like to automate cluster management on Azure, Cloudbreak supports WASB (including with DASH) and now ADLS too. It automates both the provisioning and the configuration steps.
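On the original question about several accounts in core-site.xml: as far as I know, the hadoop-azure WASB driver looks up credentials per storage account through properties named `fs.azure.account.key.<account>.blob.core.windows.net`, and paths are addressed as `wasb://<container>@<account>.blob.core.windows.net/<path>` rather than `wasb://mybucket1`. Below is a minimal sketch that generates such a fragment for two accounts. The account names (`devaccount`, `testaccount`), the container name (`data`), the key placeholders and both helper functions are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET


def account_key_properties(accounts):
    """Build a <configuration> element with one key property per storage account.

    `accounts` maps a storage account name to its access key.
    """
    config = ET.Element("configuration")
    for account, key in accounts.items():
        prop = ET.SubElement(config, "property")
        # Property name pattern used by hadoop-azure for WASB account keys.
        ET.SubElement(prop, "name").text = (
            f"fs.azure.account.key.{account}.blob.core.windows.net"
        )
        ET.SubElement(prop, "value").text = key
    return config


def wasb_uri(container, account, path=""):
    """Return the fully qualified WASB URI for a path in a given container/account."""
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Hypothetical account names and placeholder keys, not real credentials.
accounts = {
    "devaccount": "DEV_KEY_PLACEHOLDER",
    "testaccount": "TEST_KEY_PLACEHOLDER",
}

fragment = ET.tostring(account_key_properties(accounts), encoding="unicode")
print(fragment)
print(wasb_uri("data", "devaccount", "/raw/events"))
print(wasb_uri("data", "testaccount", "/raw/events"))
```

With both properties merged into core-site.xml, dev and test data would be reached through their full URIs, one per account, while the cluster's default file system stays on whichever container it was created with.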


3 REPLIES 3

Contributor

Quick update: DASH is a package available from Microsoft that allows "sharding" across multiple accounts:

https://github.com/MicrosoftDX/Dash/tree/master/DashServer


Contributor

Thanks man, it wasn't clear whether CB 1.6 supported ADLS. A better option than WASB shards with DASH, for sure.