YARN local dirs: safely moving the local dir to the data mount points

Hi all,

We have an Ambari cluster with 134 DataNode machines.

In YARN --> Configs, the property

yarn.nodemanager.local-dirs

is configured as follows:

/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local

We want to remove /var/hadoop/yarn/local

from the configuration, which also requires a YARN restart and maybe a restart of other services.

We intend to make this change to avoid writing to the local OS disk (/var).

Since we have 143 data-node machines in our Ambari cluster,

we are worried about removing /var/hadoop/yarn/local from yarn.nodemanager.local-dirs.

Or can we do it safely?

We would be happy to get Hortonworks' opinion.

We know that it is generally not a good idea to use /hadoop/yarn/local for yarn.nodemanager.log-dirs, which holds the container logs. Typically, we prefer to direct these logs to the data mount points only (like /grid/sdb/hadoop/yarn/local).

Michael-Bronson
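The change described in the question, dropping /var/hadoop/yarn/local from the comma-separated yarn.nodemanager.local-dirs value, can be sketched as below. This is only an illustration of the string edit; the helper name drop_local_dir is hypothetical, and the actual change would still be made through the YARN configuration in Ambari.

```python
def drop_local_dir(current_value: str, unwanted: str) -> str:
    """Return the comma-separated directory list without `unwanted`.

    Hypothetical helper, not part of Ambari or YARN.
    """
    dirs = [d.strip() for d in current_value.split(",") if d.strip()]
    # Compare without trailing slashes so "/a/b/" and "/a/b" match.
    kept = [d for d in dirs if d.rstrip("/") != unwanted.rstrip("/")]
    if not kept:
        # An empty list would leave the NodeManager with no local dirs.
        raise ValueError("refusing to produce an empty local-dirs value")
    return ",".join(kept)


# Value taken from the question above.
current = (
    "/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,"
    "/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,"
    "/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local"
)
new_value = drop_local_dir(current, "/var/hadoop/yarn/local")
print(new_value)
```

The printed string is what would be pasted back into yarn.nodemanager.local-dirs before restarting the affected services.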
1 ACCEPTED SOLUTION

Master Collaborator

@Michael Bronson After the configuration change, it is safe to restart the required services; the restart is what applies the new settings. In this case, yarn.nodemanager.local-dirs will point to the data mount locations such as /grid/sdb/hadoop/yarn/local instead of the old location /var/hadoop/yarn/local. In short, the restart will not cause any issue, either after deleting the old files or after the change in the YARN configuration. I hope this answers your concerns.
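For extra reassurance before the restart, one could check on a NodeManager host that every directory remaining in yarn.nodemanager.local-dirs exists and is writable. The following is a minimal sketch under assumptions: check_local_dirs is a hypothetical helper, it is meant to be run as the user that runs the NodeManager, and it is not an Ambari or YARN tool.

```python
import os


def check_local_dirs(value: str) -> dict:
    """Map each directory in a comma-separated list to a status string.

    Hypothetical pre-restart check; statuses are "ok", "missing",
    or "not writable" for the user running this script.
    """
    report = {}
    for path in (p.strip() for p in value.split(",")):
        if not path:
            continue
        if not os.path.isdir(path):
            report[path] = "missing"
        elif not os.access(path, os.W_OK | os.X_OK):
            report[path] = "not writable"
        else:
            report[path] = "ok"
    return report


# The list from the question with /var/hadoop/yarn/local removed.
remaining = (
    "/grid/sdb/hadoop/yarn/local,/grid/sdc/hadoop/yarn/local,"
    "/grid/sdd/hadoop/yarn/local,/grid/sde/hadoop/yarn/local,"
    "/grid/sdf/hadoop/yarn/local"
)
for path, status in check_local_dirs(remaining).items():
    print(f"{status:>12}  {path}")
```

Any line that does not report ok would be worth looking at on that host before the NodeManager is restarted.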

3 REPLIES

Master Collaborator

@Michael Bronson

Yes, true. Similarly, it's not a good idea to use /var for yarn.nodemanager.local-dirs, which holds the container-local data. Typically, you can direct these to all the data mount points (like /grid/sdb/hadoop/yarn/local). The same applies to the YARN logs, yarn.nodemanager.log-dirs (for example /grid/sdb/hadoop/yarn/log). This helps reduce the I/O going to your OS disk (where you typically have /var). You can take a look at http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/

I hope that the above answers your questions. Please accept the answer you found most useful.
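To see which configured directories still sit on the same filesystem as /var, one rough approach is to compare device IDs from os.stat. This is a sketch under assumptions: the helper names are hypothetical, and a matching st_dev only means the same mounted filesystem, which is a proxy for "the OS disk" rather than proof of the physical device.

```python
import os


def shares_filesystem(path: str, reference: str = "/var") -> bool:
    """True if `path` and `reference` are on the same mounted filesystem."""
    return os.stat(path).st_dev == os.stat(reference).st_dev


def dirs_on_reference_disk(value: str, reference: str = "/var") -> list:
    """List directories from a comma-separated value that share a
    filesystem with `reference`. Paths that do not exist are skipped."""
    if not os.path.exists(reference):
        return []
    found = []
    for path in (p.strip() for p in value.split(",")):
        if path and os.path.exists(path) and shares_filesystem(path, reference):
            found.append(path)
    return found


# Value from the original question; run this on a NodeManager host.
configured = (
    "/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,"
    "/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,"
    "/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local"
)
print(dirs_on_reference_disk(configured))
```

On a host laid out as described in this thread, one would expect only the /var entry to be listed, but that depends on how the /grid mounts are actually set up.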


Thank you, but one of my questions was about changing the YARN configuration by deleting /var/hadoop/yarn/local from it. As you know, YARN then requires a restart, and we need to do it, but we are a little worried that the restart will fail because of this change. Or is it safe to make this change? What is your opinion?

Michael-Bronson
