In this post, we will see how to configure node labels on YARN.
Before we get to the configuration part, let’s understand what a node label is in YARN.
Node labels allow us to divide a cluster into partitions and use each partition independently as required. More specifically, we can group NodeManagers under a label, for example a group of NodeManagers with a large amount of RAM, and use them to run only critical production jobs! This is cool, isn’t it? So let’s see how we can configure node labels on YARN.
Types of node labels:
Exclusive – Only the queues associated/mapped to the node label can access its resources.
Non-exclusive (sharable) – If the resources of this node label are idle, they can be shared with other running applications in the cluster.
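Exclusivity is chosen per label at the time the label is added to the cluster. As a minimal sketch, assuming node labels are already enabled on the ResourceManager and using the labels ‘x’ (exclusive) and ‘y’ (non-exclusive) that appear later in this post, the registration could look like the following. The `run` helper and the `DRY_RUN` variable are hypothetical conveniences, not part of YARN: by default the command is only printed, and it is executed when `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Label 'x' is exclusive, label 'y' is non-exclusive (sharable).
LABEL_SPEC="x(exclusive=true),y(exclusive=false)"

# The whole spec is passed to rmadmin as one argument.
run yarn rmadmin -addToClusterNodeLabels "$LABEL_SPEC"
```

Run it as a YARN admin user (for example `yarn`) with `DRY_RUN=0` once the printed command looks right.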
Configuring node labels:
Step 1: Create required directory structure on HDFS
Note – You can run the commands below from any HDFS client.
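A minimal sketch of this step, assuming the label store lives under `/yarn/node-labels` and should be owned by the `yarn` user (the path, owner, and permissions are assumptions, not values taken from this post). The `run` helper and `DRY_RUN` variable are hypothetical: the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumed location of the node label store on HDFS; adjust to your cluster.
LABEL_STORE_DIR="/yarn/node-labels"

# Create the directory, including missing parents.
run hdfs dfs -mkdir -p "$LABEL_STORE_DIR"
# Hand the tree to the yarn user (needs HDFS superuser rights).
run hdfs dfs -chown -R yarn:yarn /yarn
# Restrict access to the owner only.
run hdfs dfs -chmod -R 700 /yarn
```

Whatever path you choose here has to match the label store location configured for the ResourceManager.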
You can verify that the node labels have been created by checking the ‘Node Labels’ option in the left pane of the ResourceManager UI, or by running the command below on any YARN client:
yarn cluster --list-node-labels
[yarn@prodnode1 ~]$ yarn cluster --list-node-labels
16/12/14 15:45:56 INFO impl.TimelineClientImpl: Timeline service address: http://prodnode3.openstacklocal:8188/ws/v1/timeline/
16/12/14 15:45:56 INFO client.RMProxy: Connecting to ResourceManager at prodnode3.openstacklocal/172.26.74.211:8050
Node Labels: <x:exclusivity=true>,<y:exclusivity=false>
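If you want to check the labels from a script rather than by eye, the `Node Labels:` line can be split into one label per line. This is only a sketch that assumes the output format shown above; `parse_node_labels` is a hypothetical helper name, and the sample line is copied from the output above so the snippet runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: read `yarn cluster --list-node-labels` output on stdin
# and print "<label> <exclusivity>" pairs, one per line.
parse_node_labels() {
  grep '^Node Labels:' \
    | sed -e 's/^Node Labels: *//' \
    | tr ',' '\n' \
    | sed -n -e 's/^<\([^:>]*\):exclusivity=\([a-z]*\)>$/\1 \2/p'
}

# Sample line taken from the output shown above.
sample='Node Labels: <x:exclusivity=true>,<y:exclusivity=false>'
printf '%s\n' "$sample" | parse_node_labels
```

On a real cluster you would pipe the command into the helper instead, for example `yarn cluster --list-node-labels 2>/dev/null | parse_node_labels`.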
Step 5: Assign node labels to the NodeManagers using the command below:
Note – You can omit the port if only one NodeManager is running per host.
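A minimal sketch of the assignment, using `yarn rmadmin -replaceLabelsOnNode`. The host names `nm-host-1.example.com` and `nm-host-2.example.com` are placeholders, not hosts from this cluster, and the `run` helper and `DRY_RUN` variable are hypothetical: the command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Space-separated "host[:port]=label" pairs. Placeholder host names;
# the port is omitted because one NodeManager per host is assumed.
NODE_TO_LABEL="nm-host-1.example.com=x nm-host-2.example.com=y"

# The whole mapping must reach rmadmin as a single quoted argument.
run yarn rmadmin -replaceLabelsOnNode "$NODE_TO_LABEL"
```

If several NodeManagers share a host, write each entry as `host:port=label` so the label lands on the intended NodeManager.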
Step 6: Map node labels to the queues:
I have created 2 queues, ‘a’ and ‘b’, in such a way that queue ‘a’ can access nodes with labels ‘x’ and ‘y’, whereas queue ‘b’ can only access nodes with label ‘y’. By default, all queues can access nodes in the ‘default’ partition, that is, nodes with no label assigned.
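The mapping described above can be sketched as Capacity Scheduler properties. The snippet below only prints them in key=value form so you can paste them into your scheduler configuration. The capacity percentages are assumptions chosen so that each label adds up to 100 across the child queues; they are not values from this post. `queue_label_props`, `run`, and `DRY_RUN` are hypothetical names, and queues ‘a’ and ‘b’ are assumed to exist already under `root`.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Print the label-related properties for queues 'a' and 'b'.
# Capacities are illustrative: label x goes entirely to 'a',
# label y is split evenly between 'a' and 'b'.
queue_label_props() {
cat <<'EOF'
yarn.scheduler.capacity.root.accessible-node-labels=x,y
yarn.scheduler.capacity.root.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.y.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels=x,y
yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.b.accessible-node-labels=y
yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity=50
EOF
}

queue_label_props

# After saving the properties, ask the ResourceManager to reload the queues.
run yarn rmadmin -refreshQueues
```

Queue ‘b’ has no entry for label ‘x’, so it gets no capacity on nodes carrying that label, which matches the access rules described above.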