Is it possible to reserve whole nodes for exclusive application usage

Contributor

I doubt it, but I'll check here anyway. Our system is on CDH 5.15.2. The resource manager and job scheduler is YARN:

$ yarn version
Hadoop 2.6.0-cdh5.15.2
Subversion http://github.com/cloudera/hadoop -r c97bcbf0cba923467d45f5519b1953f436c64f12
Compiled by jenkins on 2018-11-13T13:53Z
Compiled with protoc 2.5.0
From source with checksum 9d2d5b887383c7d4b811372f867c6440
This command was run using /opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/jars/hadoop-common-2.6.0-cdh5.15.2.jar

 

Two users want an isolated environment to run some experiments. I wonder if we can reserve two nodes for their exclusive use.

 

Any suggestions and discussion are very welcome.

 

 

Best Regards,

Vincent

1 ACCEPTED SOLUTION

Expert Contributor

You may be able to achieve this with the YARN node labels feature. See the detailed explanation in the following Hortonworks post:

 

https://community.hortonworks.com/articles/72450/node-labels-configuration-on-yarn.html

 

The current Cloudera CDH distribution does not officially support node labels. We are working on releasing a unified version of CDH + HDP, and we will have a new Cloudera Data Platform (CDP) release later this year. If you are a subscription customer, please feel free to contact Cloudera Support to inquire about the status of this feature and of the new CDP release.
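
For readers on a distribution where node labels are supported, the basic idea can be sketched roughly as follows. This is only an illustration, not a tested procedure for CDH 5.15.2: the label name `experiments` and the two hostnames are made up, and the sketch assumes node labels are already enabled on the ResourceManager and that the scheduler in use supports them. By default the script only prints the `yarn rmadmin` commands; setting `APPLY=1` would actually run them.

```shell
#!/bin/sh
# Sketch only: label name and hostnames below are hypothetical placeholders.
LABEL="experiments"
NODES="node1.example.com node2.example.com"

# Build the "host=label host=label" mapping that -replaceLabelsOnNode expects.
mapping=""
for node in $NODES; do
  mapping="${mapping:+$mapping }${node}=${LABEL}"
done

# Print each command by default; execute it only when APPLY=1 is set.
# Note: the printed form drops the quotes around multi-word arguments.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# 1. Register the label with the cluster.
run yarn rmadmin -addToClusterNodeLabels "$LABEL"

# 2. Attach the label to the nodes to be reserved.
run yarn rmadmin -replaceLabelsOnNode "$mapping"
```

Labeling the nodes is only half of the setup. A scheduler queue also has to be granted access to the label, and the two users' applications have to be submitted to that queue, as the linked post describes.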

 

Thanks.


3 REPLIES

Contributor

It's good to know about the upcoming updates. So with the feature in place, we would be able to assign some nodes exclusively to certain users (e.g. user userA, application appX) for their processing requirements.

 

The question that naturally follows is about data storage locations: Can other users' applications still read and write to the assigned nodes? Will appX output files be written to the assigned nodes only? Will appX be allowed to read input file blocks from all nodes in the cluster? This is a lot to ask, I know.

 

Thanks a lot!

 

 

 

 

Mentor
It depends on what you mean by 'storage locations'.

If you mean "can other apps use HDFS?" then the answer is yes, as HDFS is an independent system unrelated to YARN and has its own access and control mechanisms not governed by a YARN scheduler.

If you mean "can other apps use the scratch space on NodeManager (NM) nodes?" then the answer is no, as only containers running locally on a node get to use that node's scratch space.

If you're looking to strictly split both storage and compute, as opposed to just some form of compute, then it may be better to divide up the cluster entirely.