Created 03-10-2017 06:56 PM
Hi,
I have a use case where an HDP cluster on Azure is used for dev and test. Ideally, we would like to separate the dev and test data into two different WASB storage accounts. Is there a way to define multiple accounts and keys in core-site.xml? And how would they map onto the file system? Would it simply be wasb://mybucket[1-2]?
Thanks!
Created 03-13-2017 10:43 AM
@Alex Gauthier You could use DASH, but it is just a workaround for the limitations (e.g. IOPS) of WASB. Have you checked Azure Data Lake Storage as an option? It does not have such limits, and HDP supports it. The only prerequisite is the creation of a Data Lake storage account.
As a side note, if you would like to automate cluster management on Azure, Cloudbreak supports WASB (including DASH) and now ADLS too. It automates both the provisioning and configuration steps.
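On the original question about multiple accounts: a minimal sketch, assuming the standard hadoop-azure per-account key property `fs.azure.account.key.<account>.blob.core.windows.net`. The account names `devaccount` and `testaccount`, the container names, and the key values are hypothetical placeholders, not taken from this thread. Each storage account gets its own property in core-site.xml:

```xml
<!-- core-site.xml fragment: one key property per storage account -->
<!-- devaccount / testaccount are hypothetical account names -->
<property>
  <name>fs.azure.account.key.devaccount.blob.core.windows.net</name>
  <value>DEV_ACCOUNT_ACCESS_KEY</value> <!-- placeholder, not a real key -->
</property>
<property>
  <name>fs.azure.account.key.testaccount.blob.core.windows.net</name>
  <value>TEST_ACCOUNT_ACCESS_KEY</value> <!-- placeholder, not a real key -->
</property>
```

With that in place, paths should be addressed by container and account rather than by a numbered bucket, for example wasb://devdata@devaccount.blob.core.windows.net/some/path and wasb://testdata@testaccount.blob.core.windows.net/some/path (container names `devdata` and `testdata` are also hypothetical).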
Created 03-10-2017 07:51 PM
Quick update: DASH is a package available from MSFT that allows "sharding" across multiple accounts:
Created 03-14-2017 05:31 AM
Thanks man. It wasn't clear whether CB 1.6 supported ADLS. A better option than WASB shards with DASH for sure.