How to stop YARN NodeManager container logs from being deleted automatically once a job finishes


Hi, how can I stop the container logs on my NodeManagers from being deleted once a job finishes? They seem to be deleted as soon as the job is finished; while it is running, they exist here: /data/yarn/container-logs

 


1 ACCEPTED SOLUTION

Moderator

Hello @Mondi,

Thank you for asking why application logs are deleted from the nodes after the applications finish running, and how to keep them in place.

When log aggregation is enabled (yarn.log-aggregation-enable = true) [1], you will see the behaviour you describe: once the logs have been aggregated to HDFS, they are deleted from the local file system immediately. Log aggregation does not start until the application has finished. If you have log aggregation turned on and need to keep the logs and some other temporary files on the local node for troubleshooting, set yarn.nodemanager.delete.debug-delay-sec to the number of seconds you want them retained. Its default is 0 seconds, which causes the immediate deletion.
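To make the setting concrete, here is a minimal Python sketch that renders the Hadoop-style property entry you would typically place in yarn-site.xml on each NodeManager (a NodeManager restart is normally needed for it to take effect). The helper name yarn_property and the value 86400 (one day) are illustrative assumptions, not recommendations; choose a delay that fits your disk space.

```python
from xml.sax.saxutils import escape


def yarn_property(name, value):
    """Render one Hadoop-style <property> element as text (hypothetical helper)."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(str(value))}</value>\n"
        "</property>"
    )


# 86400 seconds (one day) is only an example value, assumed for illustration.
snippet = yarn_property("yarn.nodemanager.delete.debug-delay-sec", 86400)
print(snippet)
```

The printed fragment belongs inside the `<configuration>` element of yarn-site.xml. On a managed cluster you would usually set the same property through the management console instead of editing the file by hand.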

 

If you disable log aggregation instead, the non-aggregated logs are kept for yarn.nodemanager.log.retain-seconds, which defaults to 10800 seconds (3 × 3600 seconds, or 3 hours). After that, the NodeManager deletes the log files.
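The two cases above can be summarised in a small Python sketch. It is a simplified model of the rules as described in this answer, not NodeManager code: the function name is hypothetical, the defaults are the ones quoted above, and any interaction between the two settings is ignored.

```python
def local_log_retention_seconds(log_aggregation_enabled,
                                debug_delay_sec=0,
                                retain_seconds=10800):
    """Approximate time local container logs stay on the node after the app finishes.

    Simplified model (assumption): with aggregation on, only
    yarn.nodemanager.delete.debug-delay-sec matters; with aggregation off,
    only yarn.nodemanager.log.retain-seconds matters.
    """
    if log_aggregation_enabled:
        return debug_delay_sec
    return retain_seconds


# Defaults with aggregation enabled: logs are removed right after aggregation.
print(local_log_retention_seconds(True))
# Aggregation enabled, debug delay raised to one day (example value).
print(local_log_retention_seconds(True, debug_delay_sec=86400))
# Aggregation disabled: the default three-hour retention applies.
print(local_log_retention_seconds(False))
```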

 

Please let us know if your question has been addressed!

Thank you,

Ferenc

 

[1] https://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml


Ferenc Erdelyi, Technical Solutions Manager
