
Node labels enable you to partition a cluster into sub-clusters so that jobs run on nodes with specific characteristics. For example, you can use node labels to run memory-intensive jobs only on nodes with a larger amount of RAM. Node labels are assigned to cluster nodes and marked as exclusive or shareable. You can then associate node labels with Capacity Scheduler queues. Each node can have at most one node label.
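As an illustration of the queue-to-label association mentioned above, a sketch of the relevant capacity-scheduler.xml entries (queue names match the use case below; the capacity values are assumptions):

```xml
<!-- Sketch of capacity-scheduler.xml; queue names and capacities are assumptions -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,spark</value>
</property>
<property>
  <!-- Let the spark queue request containers on nodes labeled node1 -->
  <name>yarn.scheduler.capacity.root.spark.accessible-node-labels</name>
  <value>node1</value>
</property>
<property>
  <!-- Share of the node1 partition guaranteed to the spark queue -->
  <name>yarn.scheduler.capacity.root.spark.accessible-node-labels.node1.capacity</name>
  <value>100</value>
</property>
```

A queue can only place containers on a labeled node if the label is listed in its accessible-node-labels and has a non-zero capacity for that partition.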


Use case

Two node labels (node1 and node2) plus two queues (default and spark)

Submit a job to node1

Add the node labels: yarn rmadmin -addToClusterNodeLabels "node1(exclusive=true),node2(exclusive=false)"

Assign the labels to nodes: yarn rmadmin -replaceLabelsOnNode ""
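The -replaceLabelsOnNode call above was left without its host=label argument; a sketch of the full syntax, with hypothetical NodeManager hostnames, would be:

```bash
# Hypothetical hostnames; replace with your NodeManager hosts
yarn rmadmin -replaceLabelsOnNode "host1.example.com=node1 host2.example.com=node2"

# Verify the assignment (available in newer Hadoop releases)
yarn cluster --list-node-labels
```

Each host can be mapped to at most one label, matching the one-label-per-node rule above.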

Job Submission:

Job sent to node1 only and assigned to queue spark:

hadoop jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 100" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -queue spark -node_label_expression node1

Job sent to node2 only and assigned to queue spark:

hadoop jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 100" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -queue spark -node_label_expression node2

Job sent to node1 only and assigned to queue default:

hadoop jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 100" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -queue default -node_label_expression node1
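To confirm where a submitted job actually ran, the standard YARN CLI can be used (the application id below is hypothetical):

```bash
# List running applications with their queues
yarn application -list -appStates RUNNING

# Inspect which hosts ran containers for one finished application
# (application id is hypothetical)
yarn logs -applicationId application_1500000000000_0001 | grep -i "Container:"
```

For the examples above, the container hosts should all carry the label named in -node_label_expression.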

More details - Doc link



After reading the above, I'm curious what would happen in the following scenario:

1) Create queues (ex: Rack1, Rack2, Rack3...)

2) Create (exclusive=true) Node Labels and assign to queues per my physical rack layout

3) Don't set up HDFS rack awareness (so replication won't take racks into account)

4) Submit a job to the queue "Rack1", but all blocks for its data live on DataNodes in a different rack (e.g., Rack2)

Would the YARN AM try to create a remote container on a node in Rack2? Or keep using containers in Rack1 and fetch the data from remote DataNodes?


It will go to Rack1: with exclusive labels, containers can only be allocated on Rack1 nodes, so the data is fetched from the remote DataNodes. As for step 3 ... I won't do that :)