Created 12-30-2016 09:58 AM
My understanding, along with my questions, is below.
AWS-HDCloud
Manual scaling is possible using Ambari or the AWS UI.
Auto Scaling
1. Is auto-scaling possible with this option (i.e., can I set an auto-scaling group while creating the cluster)?
1.1. If so, how is the data rebalanced? For example, if a new node is added, compute on that node may not gain data locality.
--------------------------------------------------------------------------------------------------------------------------------------------------------------
AWS-HDP on IaaS
Manual scaling using Ambari is possible.
Auto Scaling-Without Cloudbreak
2. Is auto-scaling possible with this option (i.e., can I set an auto-scaling group while creating the cluster)?
2.1. If so, how is the data rebalanced? For example, if a new node is added, compute on that node may not gain data locality.
Auto Scaling-With Cloudbreak
Auto-scaling may be possible, but question 2.1 applies here as well.
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Azure-HDInsight
Manual scaling is possible using Ambari or the Azure UI.
Auto Scaling
3. Is auto-scaling possible with this option (i.e., can I set an auto-scaling group while creating the cluster)?
3.1. If so, how is the data rebalanced? For example, if a new node is added, compute on that node may not gain data locality.
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Azure-HDP in MarketPlace
Manual scaling is possible using Ambari or the Azure UI.
Auto Scaling
4. Is auto-scaling possible with this option (i.e., can I set an auto-scaling group while creating the cluster)?
4.1. If so, how is the data rebalanced? For example, if a new node is added, compute on that node may not gain data locality.
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Azure-HDP on IaaS
Same questions as for AWS-HDP on IaaS.
Created 01-02-2017 04:15 PM
To state it most simply, auto-scaling is currently a capability of Cloudbreak only. With Cloudbreak Periscope, you can define a scaling policy and apply it to any alert on any Ambari metric. Scaling granularity is at the Ambari host group level, which gives you the option to scale only specific services or components rather than the whole cluster. Per your line of questioning above, if you use Cloudbreak to provision HDP on either Azure IaaS or AWS IaaS, you can use the auto-scaling capabilities it provides. Both Azure HDInsight (HDI) and Hortonworks Data Cloud for AWS (HDC) make it very easy to manually resize your cluster through their respective consoles, but auto-scaling is not a feature of either offering at this point in time.
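To make the host-group granularity concrete, here is a minimal sketch of how a policy tied to an alert might translate into a new node count for one host group. This is not Cloudbreak's or Periscope's API; the `ScalingPolicy` fields, the `target_size` function, and the alert and host group names are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy model; field names are assumptions, not Periscope's schema."""
    host_group: str   # the Ambari host group the policy scales
    alert_name: str   # the alert that triggers the policy
    adjustment: int   # nodes to add (positive) or remove (negative)
    min_nodes: int    # lower bound for the host group
    max_nodes: int    # upper bound for the host group


def target_size(current_nodes: int, policy: ScalingPolicy, alert_triggered: bool) -> int:
    """Return the node count the host group should have after evaluating the policy."""
    if not alert_triggered:
        return current_nodes
    proposed = current_nodes + policy.adjustment
    # Clamp to the configured bounds so one alert cannot over- or under-scale the group.
    return max(policy.min_nodes, min(policy.max_nodes, proposed))


# Hypothetical example: grow the "worker" host group by 2 when an alert fires.
upscale = ScalingPolicy("worker", "pending_containers_high", 2, 3, 10)
new_size = target_size(5, upscale, alert_triggered=True)
```

Only the named host group changes size here, which mirrors the point that scaling applies per host group rather than to the whole cluster.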
With regard to data rebalancing, neither HDI nor HDC needs to be concerned with this, because both are automatically configured to use cloud storage (currently ADLS and S3 respectively), not HDFS. For HDP deployed on IaaS with Cloudbreak, auto-scaling may perform an HDFS rebalance, but only after a downscale operation. To keep HDFS healthy during downscale, Cloudbreak always preserves the configured replication factor and makes sure that there is enough space on HDFS to rebalance data. During downscale, in order to minimize rebalancing, re-replication, and HDFS storms, Cloudbreak checks block locations and computes the least costly operations.
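The downscale checks described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cloudbreak's actual algorithm: the `DataNode` type and `pick_nodes_to_remove` function are hypothetical, and used space per node stands in as a simple proxy for the cost of moving its blocks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataNode:
    """Hypothetical view of a DataNode; not a real Ambari or HDFS type."""
    name: str
    capacity_gb: float
    used_gb: float


def pick_nodes_to_remove(nodes: List[DataNode], count: int, replication_factor: int) -> List[str]:
    """Return the names of nodes to decommission, or [] if the downscale is unsafe."""
    # The surviving nodes must still be able to hold one replica each.
    if count <= 0 or len(nodes) - count < replication_factor:
        return []

    # Assumption: removing the nodes holding the least data moves the fewest blocks.
    by_cost = sorted(nodes, key=lambda n: n.used_gb)
    candidates, survivors = by_cost[:count], by_cost[count:]

    # All data currently stored must fit on the nodes that remain.
    total_used = sum(n.used_gb for n in nodes)
    remaining_capacity = sum(n.capacity_gb for n in survivors)
    if total_used > remaining_capacity:
        return []

    return [n.name for n in candidates]


cluster = [
    DataNode("dn1", 100, 10),
    DataNode("dn2", 100, 50),
    DataNode("dn3", 100, 20),
    DataNode("dn4", 100, 60),
]
to_remove = pick_nodes_to_remove(cluster, count=1, replication_factor=3)
```

An empty result means the sketch refuses the downscale, either because too few nodes would remain for the replication factor or because the remaining capacity could not absorb the data.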
Created 01-03-2017 05:49 PM
@learninghuman If this answer helps, please accept it. Otherwise, I'd be happy to answer any remaining questions you have.
Thanks! _Tom