I have a small cluster of 3 nodes with 40 GB of memory each.
I've successfully been able to scale a cluster up using a slightly modified version of https://github.com/hortonworks/cloudbreak/blob/release-1.16/autoscale/src/main/resources/alerts/allo...
The difference is that I trigger the alert when the percentage of allocated memory is above 95%.
This works perfectly: my cluster scales up when required.
Now I want to do the reverse operation: scale down when I have one idle compute node.
I'm going to try to create an alert that fires when there are roughly 40 GB of free memory, plus 5% of the 120 GB total as a margin to prevent oscillations (scale up / down / up / down...).
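The threshold arithmetic above can be sketched as follows. This is a minimal illustration using the numbers from this thread (3 nodes x 40 GB = 120 GB total); the function name and structure are hypothetical, not part of Cloudbreak:

```python
# Scale-down hysteresis sketch: only scale down when more than one full
# node's memory is free, plus a 5% safety margin to avoid oscillations.
NODE_MEMORY_GB = 40
TOTAL_MEMORY_GB = 120                     # 3 nodes x 40 GB
MARGIN_GB = 0.05 * TOTAL_MEMORY_GB        # 5% of 120 GB = 6 GB

SCALE_DOWN_THRESHOLD_GB = NODE_MEMORY_GB + MARGIN_GB  # 46 GB

def should_scale_down(free_memory_gb: float) -> bool:
    """True when a compute node can be removed without immediately
    re-triggering the scale-up alert."""
    return free_memory_gb >= SCALE_DOWN_THRESHOLD_GB
```

With these numbers the cluster only scales down once at least 46 GB are free, so removing a 40 GB node still leaves a 6 GB cushion below the scale-up condition.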
Do you think this is a good way to solve my problem?
I think this is the correct approach for your problem. I suggest configuring enough cooldown time between scaling events, because HDFS data movement can be slow if you have a lot of data.
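The cooldown idea can be sketched like this. Cloudbreak has its own cooldown setting on scaling policies; this standalone illustration (with an assumed 30-minute value) just shows the gating logic:

```python
# Cooldown gate sketch: suppress new scaling actions until enough time
# has passed since the last one, giving HDFS rebalancing time to finish.
COOLDOWN_SECONDS = 30 * 60   # illustrative value, tune to your data volume

last_scaling_event = 0.0     # timestamp of the most recent scaling action

def may_scale(now: float) -> bool:
    """True when the cooldown since the last scaling event has elapsed."""
    return now - last_scaling_event >= COOLDOWN_SECONDS
```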
Thanks a lot for your feedback!
Yes, we have currently set a long cooldown between scaling events and this seems OK.
However, we are investigating the following solution to remove this constraint, and I'd be glad to discuss it with you, or anyone interested in the proposal below.
Separate HDFS on some nodes and "compute" on other nodes.
For example we start a cluster with :
- 3 hdfs nodes
- 2 or 3 compute nodes
And we use Cloudbreak + Ambari to create several alerts and scaling policies:
- 1 to scale up compute nodes (based on the % of allocated memory)
- 1 to scale down compute nodes (based on the amount of remaining free memory)
- 1 to scale up HDFS nodes (based on HDFS usage)
- 1 to scale down HDFS nodes (also based on HDFS usage)
With this, I believe we have a cluster that really scales according to our requirements.
I think you are on the right track. This is how we do things in Hortonworks Data Cloud.
We have three types of nodes there: masters, workers, and computes, and we scale the computes up and down. One thing you should investigate is where you want to store your data: I would suggest a cloud object store supported by Hortonworks, such as S3 (AWS) or ADLS (Azure).