02-21-2017 01:53 PM
@mbigelow but other sources say "set the yarn.log-aggregation.retain-check-interval-seconds to specify how often the log retention check should be run. By default, it is one-tenth of the log retention time". What I understood from this is that it only controls how often the retention check runs, and does not make the logs get aggregated at that interval. Did I understand it correctly?
04-14-2017 10:38 AM
It's true that you can aggregate logs to HDFS while the job is still running; however, the minimum log upload interval (yarn.nodemanager.log-aggregation.roll-monitoring-
You may have to use an external service to do the log aggregation. Either write your own or find other tools.
Below is the relevant excerpt from yarn-default.xml in the hadoop-common source code (cdh5-2.6.0_5.7.1).
<description>Defines how often NMs wake up to upload log files.
The default value is -1. By default, the logs will be uploaded when
the application is finished. By setting this configure, logs can be uploaded
periodically when the application is running. The minimum rolling-interval-seconds
can be set is 3600.
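
For anyone who wants to try periodic upload anyway, here is a minimal sketch of what the yarn-site.xml entries might look like. It assumes the property described above is yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds (the full name as it appears in upstream yarn-default.xml) and that log aggregation is enabled through yarn.log-aggregation-enable. The values are illustrative only, so check them against your own distribution before using them.

```xml
<!-- yarn-site.xml fragment: a sketch, not a tested configuration -->
<property>
  <!-- Log aggregation has to be on for any upload to HDFS to happen -->
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- Assumed full property name, taken from upstream yarn-default.xml -->
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <!-- 3600 is the minimum per the description quoted above;
       the default of -1 uploads only when the application finishes -->
  <value>3600</value>
</property>
```

With that lower bound, the most frequent upload you can get is once an hour, which is why an external tool is still needed for anything closer to real time.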