Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

yarn local dirs + safely moving the local dir to data mount points


Hi all,

We have an Ambari cluster with 134 DataNode machines.

In YARN --> Config, yarn.nodemanager.local-dirs is configured as follows:

/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local

We want to remove /var/hadoop/yarn/local from the configuration, which requires a YARN restart and possibly a restart of other services.

We intend to do this to avoid writing to the local OS disk (/var).

Since we have 143 DataNode machines in our Ambari cluster, we are worried about removing /var/hadoop/yarn/local from yarn.nodemanager.local-dirs. Or can we do it safely?

We would be happy to get Hortonworks' opinion.

We know that it is generally not a good idea to use /hadoop/yarn/local for yarn.nodemanager.log-dirs, which holds the container logs. Typically, we prefer to direct these logs only to the data mount points (like /grid/sdb/hadoop/yarn/local).

Michael-Bronson
1 ACCEPTED SOLUTION


@Michael Bronson After the configuration change, it is safe to restart the required services; the restart is what applies the new settings. In this case, yarn.nodemanager.local-dirs will point to the remaining locations, such as /grid/sdb/hadoop/yarn/local, instead of the old location /var/hadoop/yarn/local. In short, the restart will not cause any issue, either after deleting the old files or after the change in the YARN configuration. I hope this answers your concerns.
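As a rough illustration of the change discussed here, the following Python sketch derives the new yarn.nodemanager.local-dirs value from the one quoted in the question and, when run on a NodeManager host, lists remaining directories that do not exist or are not writable. The function names `remove_local_dir` and `unusable_dirs` are hypothetical helpers, not part of YARN or Ambari, and the writability check reflects the user running the script, which may differ from the account the NodeManager runs as.

```python
import os

# Value quoted in the question for yarn.nodemanager.local-dirs.
CURRENT_LOCAL_DIRS = (
    "/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,"
    "/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,"
    "/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local"
)


def remove_local_dir(value, unwanted):
    """Return the comma-separated list without the `unwanted` entry.

    Hypothetical helper: it only edits the string; the result still has
    to be saved in the YARN configuration and applied with a restart.
    """
    entries = [e.strip() for e in value.split(",") if e.strip()]
    kept = [e for e in entries if e.rstrip("/") != unwanted.rstrip("/")]
    if not kept:
        # An empty list would leave the NodeManager with no local dirs.
        raise ValueError("refusing to produce an empty local-dirs value")
    return ",".join(kept)


def unusable_dirs(value):
    """Return entries that are not existing, writable directories.

    Assumption: run on the node itself; the check applies to the user
    running this script, not necessarily the NodeManager's user.
    """
    entries = [e.strip() for e in value.split(",") if e.strip()]
    return [
        d for d in entries
        if not (os.path.isdir(d) and os.access(d, os.W_OK | os.X_OK))
    ]


NEW_LOCAL_DIRS = remove_local_dir(CURRENT_LOCAL_DIRS, "/var/hadoop/yarn/local")

if __name__ == "__main__":
    print("new value:", NEW_LOCAL_DIRS)
    print("unusable on this host:", unusable_dirs(NEW_LOCAL_DIRS))
```

An empty "unusable" list on a sample of nodes before the restart is a reasonable sanity check, though it does not replace watching the NodeManager logs after the restart.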


3 REPLIES 3


@Michael Bronson

Yes, that is true. Similarly, it is not a good idea to use /var for yarn.nodemanager.local-dirs, which holds the container-local data. Typically, you can direct these to all the data mount points (like /grid/sdb/hadoop/yarn/local). The same applies to the YARN logs in yarn.nodemanager.log-dirs (for example /grid/sdb/hadoop/yarn/log). This helps keep that IO off your OS disk (where you typically have /var). You can take a look at http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/
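To spot entries that still sit on the OS disk, a small Python sketch like the one below can flag any configured directory under a given prefix. It assumes, as described above, that /var lives on the OS disk; it only compares path prefixes and does not inspect the real mount table. The name `entries_under` is a hypothetical helper.

```python
import posixpath

# Value quoted in the question for yarn.nodemanager.local-dirs.
LOCAL_DIRS = (
    "/var/hadoop/yarn/local,/grid/sdb/hadoop/yarn/local,"
    "/grid/sdc/hadoop/yarn/local,/grid/sdd/hadoop/yarn/local,"
    "/grid/sde/hadoop/yarn/local,/grid/sdf/hadoop/yarn/local"
)


def entries_under(value, prefixes=("/var",)):
    """Return the entries of a comma-separated dir list under any prefix.

    Assumption: `prefixes` lists paths known to be on the OS disk.
    This is a plain path comparison, not a mount-point lookup.
    """
    flagged = []
    for entry in (e.strip() for e in value.split(",")):
        if not entry:
            continue
        norm = posixpath.normpath(entry)
        for prefix in prefixes:
            base = posixpath.normpath(prefix)
            # Match the prefix itself or anything below it, so that
            # "/var" does not accidentally match "/variable/...".
            if norm == base or norm.startswith(base.rstrip("/") + "/"):
                flagged.append(entry)
                break
    return flagged


if __name__ == "__main__":
    print(entries_under(LOCAL_DIRS))
```

The same check can be applied to the yarn.nodemanager.log-dirs value.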

I hope that the above answers your questions. Please accept the answer you found most useful.


Thank you, but one of my questions was about changing the YARN configuration by deleting /var/hadoop/yarn/local. As you know, YARN requires a restart after that, and we need to do it, but we are a little worried that the restart will fail because of this change. Or is it safe to make this change? What is your opinion?

Michael-Bronson
