We have a Spark production cluster with the YARN service (based on HDP 2.6.5).
The total number of NodeManager services is 745 (i.e., 745 Linux machines).
The active ResourceManager and the standby ResourceManager are installed on different master machines.
We found that the following parameters are not defined in our YARN configuration (yarn-site.xml):
yarn.scheduler.increment-allocation-vcores
yarn.scheduler.increment-allocation-mb
These parameters are defined neither in Ambari nor in the YARN XML configuration files.
I want to know the meaning of the parameter yarn.scheduler.increment-allocation-vcores.
And what is the effect if these parameters are not defined in our configuration?
From YARN best-practice configuration guides we understand that both parameters are part of the YARN configuration, but we are not sure whether we must add them to the YARN custom configuration.
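If we did decide to add them, our assumption (based on the defaults mentioned in the references below, which we have not validated) is that the custom yarn-site configuration in Ambari would look roughly like this, with the values shown only as placeholders:

  <property>
    <name>yarn.scheduler.increment-allocation-mb</name>
    <value>1024</value>  <!-- assumed placeholder: memory requests rounded up in 1024 MB steps -->
  </property>
  <property>
    <name>yarn.scheduler.increment-allocation-vcores</name>
    <value>1</value>     <!-- assumed placeholder: vcore requests rounded up in steps of 1 vcore -->
  </property>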
From the documentation we found:
Minimum and maximum allocation unit in YARN
- Two resources, memory and CPU, have minimum and maximum allocation units in YARN as of Hadoop 2.5.1, as set by the configurations in yarn-site.xml.
- Basically, it means the RM can only allocate memory to containers in increments of “yarn.scheduler.minimum-allocation-mb” and cannot exceed “yarn.scheduler.maximum-allocation-mb”.
- It can only allocate CPU vcores to containers in increments of “yarn.scheduler.minimum-allocation-vcores” and cannot exceed “yarn.scheduler.maximum-allocation-vcores”.
- If changes are required, set the above configurations in yarn-site.xml on the RM nodes and restart the RM.
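To make sure we read the paragraph above correctly, this is the kind of yarn-site.xml settings we understand it is describing; the values are purely illustrative and are not taken from our cluster:

  <!-- illustrative example only, not our production values -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>  <!-- e.g., a 3000 MB container request would be rounded up to 3072 MB (3 x 1024) -->
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>  <!-- as we understand it, requests above this value are rejected by the RM -->
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>8</value>
  </property>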
References:
https://docs.trifacta.com/display/r076/Tune+Cluster+Performance
https://stackoverflow.com/questions/58522138/how-to-control-yarn-container-allocation-increment-prop...
https://pratikbarjatya.github.io/learning/best-practices-for-yarn-resource-management/
Michael-Bronson