YARN Node Labels Not Working ?? Two Issues

Contributor

Background: I have configured two node labels - HiCPU (exclusive=false) and GPU (exclusive=true). I have attached HiCPU to a queue named Engineering with 100% capacity. Label GPU is attached to a queue named Marketing with 100% capacity. No default label has been configured for either queue at the beginning of the test.
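For readers following along, the label setup described above might be created roughly as follows. This is only a dry-run sketch: it prints the commands rather than running them, the host names are hypothetical placeholders, and `run` and `label_setup` are helper names made up for this example. The queue-to-label mapping itself lives in the capacity scheduler configuration and is not shown here.

```shell
#!/bin/sh
# Dry-run sketch: prints the yarn commands instead of executing them.
# node1.example.com and node2.example.com are hypothetical host names.
run() { echo "+ $*"; }

label_setup() {
    # Register the two labels from the post with their exclusivity settings.
    run yarn rmadmin -addToClusterNodeLabels '"HiCPU(exclusive=false),GPU(exclusive=true)"'
    # Attach each label to a node (assumed one node per label for brevity).
    run yarn rmadmin -replaceLabelsOnNode '"node1.example.com=HiCPU node2.example.com=GPU"'
}

label_setup
```

To apply the commands on a real cluster, one would replace the body of `run` with `"$@"` and drop the inner quoting, after checking the syntax against the documentation for the installed version.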

Issue 1:

When I run either of the following commands as the hdfs user, the job runs only on an unlabeled node, and if no unlabeled nodes are available, the job simply hangs:

yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 25" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -num_containers 3 -queue Engineering ResourceRequest.setNodeLabelExpression HiCPU

or

yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 25" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -num_containers 3 -queue Marketing ResourceRequest.setNodeLabelExpression GPU

However, if I set the node label as default, the commands *DO* execute on the appropriate machine, even without the ResourceRequest.setNodeLabelExpression attribute (as would be expected).

Bottom line - I can only get node labels to work for a YARN job if they are set as the default, which means non-labeled nodes are not available to that queue any longer for YARN jobs.

Issue 2:

Our documentation here:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_yarn_resource_mgt/content/using_node_labe...

...states the following:

"...if you submit a MapReduce job to a queue that has a default node label expression, the default node label will be applied to the MapReduce job."

To test this, I executed the following command using a user who was default-queue-mapped to the Engineering queue:

yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 10

While the job did get assigned to the correct queue, in no instance could I get the MapReduce job to run on a labeled node. It would run on an unlabeled node if available, and if no unlabeled nodes were available, it would just hang.

Thus, unless I am missing something, it appears that the *only* way to get a functional use of node labels is to set them as a default for a queue, in which case YARN jobs assigned to that queue will ONLY run on the labeled nodes (including unlabeled YARN jobs). Furthermore, under no circumstance will a MapReduce job run on a labeled node, regardless of default node label settings.

If someone wants to take a look at my settings on my cluster and troubleshoot, let me know. Thanks!

@Neeraj

Re: YARN Node Labels Not Working ?? Two Issues

@jeden@hortonworks.com Looking forward to reproducing this.

Re: YARN Node Labels Not Working ?? Two Issues

Contributor

Sure! Let me know if you want a peek at my setup.

Re: YARN Node Labels Not Working ?? Two Issues

@jeden did you get an answer to this issue?

Re: YARN Node Labels Not Working ?? Two Issues

Contributor

Not really. I have a coworker going back to validate what I saw now. Assuming he sees the same things I did, we'll be checking to see if bugs have been filed, etc. Neeraj, anything to add on your part? @Mark Herring @Neeraj Sabharwal

Re: YARN Node Labels Not Working ?? Two Issues

@jeden I need to book time to work on this. I definitely want to test it.

Re: YARN Node Labels Not Working ?? Two Issues

Contributor

@Neeraj Sabharwal

Interesting. So the command you used *did* allow you to set node labels on the fly. If I'm reading correctly, the main difference was the

ResourceRequest.setNodeLabelExpression nodeLabel

being replaced with

-node_label_expression node1

Is that right? I went back and checked the documentation, and saw that it now matches what you performed. So the core of my issue in particular was a bug in the docs. Thanks for following up!

Re: YARN Node Labels Not Working ?? Two Issues

@jeden

Please see this

ResourceRequest.setNodeLabelExpression(<node_label_expression>) -- sets the node label expression for individual resource requests. This will override the node label expression set in ApplicationSubmissionContext.setNodeLabelExpression(<node_label_expression>).
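The override order in that description can be pictured with a small toy model. This is not Hadoop code: `effective_label` is a hypothetical helper, an empty string stands for "not set", and falling back to the queue's default node label expression last is an assumption based on the documentation quoted in the question.

```shell
#!/bin/sh
# Toy model of node label precedence; not part of Hadoop.
# An empty argument means "no label expression set at this level".
effective_label() {
    request_label=$1   # ResourceRequest.setNodeLabelExpression
    app_label=$2       # ApplicationSubmissionContext.setNodeLabelExpression
    queue_default=$3   # queue default node label expression (assumed fallback)
    if [ -n "$request_label" ]; then
        echo "$request_label"
    elif [ -n "$app_label" ]; then
        echo "$app_label"
    else
        echo "$queue_default"
    fi
}

effective_label "GPU" "HiCPU" ""   # request-level label wins: GPU
effective_label "" "HiCPU" ""      # application-level label: HiCPU
effective_label "" "" "HiCPU"      # queue default applies: HiCPU
```

If all three levels are empty, the model prints an empty line, which corresponds to the request landing on unlabeled nodes.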

I used the following commands.

hadoop jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command "sleep 100" -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -queue spark -node_label_expression node1
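Adapted to the queues and labels from the original question, the same form might look like the sketch below. It only prints the command lines so they can be reviewed before running; `dshell_cmd` is a hypothetical helper, and the jar path, sleep time, and container count are copied from the question rather than verified on a cluster.

```shell
#!/bin/sh
# Jar path as given in the thread; adjust for your installation.
DSHELL_JAR=/usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar

# Print (do not run) a distributed shell command for a queue and node label.
# dshell_cmd is a made-up helper name, not part of Hadoop.
dshell_cmd() {
    queue=$1
    label=$2
    printf 'yarn jar %s -shell_command "sleep 25" -jar %s -num_containers 3 -queue %s -node_label_expression %s\n' \
        "$DSHELL_JAR" "$DSHELL_JAR" "$queue" "$label"
}

dshell_cmd Engineering HiCPU
dshell_cmd Marketing GPU
```

The only change from the commands in the question is that the trailing `ResourceRequest.setNodeLabelExpression <label>` text is replaced by the `-node_label_expression <label>` option.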

Re: YARN Node Labels Not Working ?? Two Issues

Expert Contributor

Hi, adding one point to your discussion. If the cluster has no default partition (no unlabeled nodes), queues are unable to run additional jobs even when resources are available.

Initially, I configured four node labels with four nodes each and no default partition. In this case, when I submitted a job to a queue, the queue ran only that one job; a second job I submitted stayed in the ACCEPTED state even though cluster resources were available to that queue.

Later, I configured four node labels with four nodes each plus a default partition of two nodes. With that setup, I am able to run multiple jobs in a queue.

I found out that this is an Application Master bug reported in JIRA:

https://issues.apache.org/jira/browse/YARN-3216

I thought it might help with your further testing.
