I would like to know how to allocate resources among CDH services.
I'm aware that I can use static resource pools to divide resources among HDFS, HBase, Impala, and YARN. I can also set up dynamic resource pools among the services running on YARN.
My questions are:
1. Does the percentage in a static resource pool mean a guaranteed minimum? There is a warning sign whose hint reads 'total must add to 100%'. What I would like is a scheduler that guarantees a minimum to each service, so that the total is less than 100% and the unallocated resources are shared dynamically with whichever service needs them.
2. Plan 1: isolate Impala and YARN. Plan 2: run Impala on YARN through Llama. Which plan does Cloudera recommend?
3. How do I decide how much resource to allocate to each static resource pool? How do I calculate how much resource HDFS needs? Is there a formula, for example how many MB of memory per GB of data or per block in HDFS?