I'm working on a 3-node cluster (HDP 2.6 installed). I use one node as the master node, where all the master services run, so its memory is needed for those services. Because of that, I want to prevent YARN from allocating resources on this node, as I don't want YARN containers to run there.
1. How can I avoid this allocation for that node?
2. Do I have to uninstall a service?
3. Is it a good approach to use one node as the master?
As my HDFS has two DataNodes installed (on the two slave nodes), I need the YARN containers there so that computation runs locally on these nodes. I don't see why I should also provide resources from my master node to the YARN service.
4. Can someone explain?
You can uninstall the NodeManager from the master node and keep NodeManagers colocated with the DataNodes. That stops YARN from launching containers on the master node. Because the remaining NodeManagers run on the same slave nodes as the DataNodes, you don't need to worry about data locality.
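After removing the NodeManager from the master, one way to confirm the result is to ask the ResourceManager which NodeManagers it still sees. The sketch below is only an illustration: it assumes the ResourceManager REST endpoint /ws/v1/cluster/nodes is reachable, and the host names and the URL in the comment (master.example.com, slave1.example.com, port 8088) are made-up placeholders, not values from this cluster.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_url):
    """Fetch the node report from the ResourceManager REST API.

    rm_url is an assumed placeholder such as "http://master.example.com:8088".
    """
    with urlopen(rm_url.rstrip("/") + "/ws/v1/cluster/nodes") as resp:
        return json.load(resp)


def running_nodemanager_hosts(payload):
    """Return the sorted host names of NodeManagers reported as RUNNING."""
    # The API returns {"nodes": null} when no NodeManager is registered.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return sorted(n["nodeHostName"] for n in nodes if n.get("state") == "RUNNING")


# Hypothetical response for a cluster where only the two slaves run a NodeManager.
sample = {
    "nodes": {
        "node": [
            {"nodeHostName": "slave1.example.com", "state": "RUNNING"},
            {"nodeHostName": "slave2.example.com", "state": "RUNNING"},
            {"nodeHostName": "master.example.com", "state": "DECOMMISSIONED"},
        ]
    }
}

hosts = running_nodemanager_hosts(sample)
print(hosts)
print("master.example.com" in hosts)
```

Against a live cluster you would pass the output of fetch_nodes(...) instead of the sample dictionary. If the master host no longer appears among the RUNNING nodes, YARN has no NodeManager there to launch containers on.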
Thank you for the very fast answer. So I will uninstall the NodeManager on my master node.
Did I understand you right: is the NodeManager not used if there is no DataNode installed on the same host?
And what would happen if NodeManager and DataNode were installed on different nodes?
"Did I understand you right: is the NodeManager not used if there is no DataNode installed on the same host?" No, that is not the right understanding. A NodeManager is used whether or not a DataNode is on the same host. Your installation has slave nodes where the DataNodes run; if NodeManagers are installed only on those nodes, YARN will launch containers only there, not on the master node.
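To make the point about placement concrete, here is a small sketch that is not part of the answer above. It assumes the usual HDFS behaviour that a container can read a block from local disk only when a DataNode holding that block runs on the same host; a NodeManager on a host without a DataNode still runs containers, but they fetch HDFS data over the network. The host names are hypothetical.

```python
def locality_by_nodemanager(nodemanager_hosts, datanode_hosts):
    """Map each NodeManager host to the kind of HDFS reads its containers can do."""
    datanodes = set(datanode_hosts)
    return {
        host: "node-local reads possible" if host in datanodes else "remote reads only"
        for host in nodemanager_hosts
    }


# Hypothetical layout: DataNodes only on the two slaves.
datanodes = ["slave1", "slave2"]

# NodeManagers colocated with the DataNodes (the suggested setup).
print(locality_by_nodemanager(["slave1", "slave2"], datanodes))

# A NodeManager left on the master is still used, just without local data.
print(locality_by_nodemanager(["master", "slave1", "slave2"], datanodes))
```

In the colocated layout every NodeManager host can serve node-local reads, which is why removing the NodeManager from the master costs nothing in terms of data locality.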