Created on 10-02-2015 08:01 PM - edited 09-16-2022 01:32 AM
Here are some key things that will help make an HDInsight cluster more manageable and perform better. Note the following best practices.
One thing to note is that only Hadoop services can be stopped. VMs are not exposed and cannot be paused. If the goal is to reduce the cost of a running environment, it's better to delete the cluster and recreate it when needed.
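To get a rough feel for the cost argument, here is a minimal sketch that compares keeping a cluster up all month with creating it only for the hours it is needed. It assumes compute is billed per node-hour for as long as the cluster exists. The function name, the hourly rate, and the node count are all hypothetical placeholders, not real Azure prices, and storage charges are left out.

```python
def monthly_compute_cost(hourly_rate, node_count, hours_per_day, days=30):
    """Estimate compute cost for a cluster that exists hours_per_day each day.

    All inputs are illustrative; plug in your own rate and cluster size.
    """
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    return hourly_rate * node_count * hours_per_day * days


# Hypothetical numbers for illustration only (not actual Azure pricing).
RATE = 0.50   # assumed cost per node-hour
NODES = 4     # assumed cluster size

always_on = monthly_compute_cost(RATE, NODES, 24)  # cluster never deleted
on_demand = monthly_compute_cost(RATE, NODES, 8)   # deleted outside an 8-hour window

print(f"always on: {always_on:.2f}")
print(f"on demand: {on_demand:.2f}")
print(f"saved:     {always_on - on_demand:.2f}")
```

Stopping the Hadoop services alone would not change the first figure in this model, since the VMs keep running; only deleting the cluster shortens the billed hours.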
Created on 01-27-2016 10:55 PM
Use an HDInsight on Linux cluster to have control over the VMs through Ambari. Take a look at the public preview of Azure Data Lake Store, which gets you past the storage account throttling and total size limits. Because storage and compute are separate, you can move and load data with tools outside the cluster (SSIS, AzCopy, ADF, etc.), even when the cluster doesn't currently exist. Multiple HDInsight clusters plus Azure Data Lake Analytics can all access the same data at the same time.
Created on 02-02-2016 04:53 PM
Thank you for your Microsoft contributions on HCC @Cindy Gross