Created on 02-20-2017 04:53 PM - edited 09-16-2022 04:07 AM
Hi, we need to find a way to maintain and search logs for long-running Spark streaming jobs on YARN. We have log aggregation disabled in our cluster. We are thinking about Solr/Elasticsearch, and maybe Flume or Kafka, to read the Spark job logs.
Any suggestions on how to implement search on these logs and manage them easily?
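For reference, the kind of pipeline we have in mind would push the driver/executor logs into Kafka through log4j, roughly like the sketch below (the broker list and topic name are placeholders, and it assumes the kafka-log4j-appender jar is on the executor classpath):

# log4j.properties shipped with the job
log4j.rootLogger=INFO, console, KAFKA
# normal console output, so the YARN container logs still work
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
# ship every log event to a Kafka topic for downstream indexing (Solr/Elasticsearch)
log4j.appender.KAFKA=org.apache.kafka.log4jappender.KafkaLog4jAppender
log4j.appender.KAFKA.brokerList=broker1:9092
log4j.appender.KAFKA.topic=spark-app-logs
log4j.appender.KAFKA.syncSend=false
log4j.appender.KAFKA.layout=org.apache.log4j.PatternLayout
log4j.appender.KAFKA.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
# keep the Kafka client's own logging off the KAFKA appender to avoid recursion
log4j.logger.org.apache.kafka=WARN, console
log4j.additivity.org.apache.kafka=false

passed to the job with something like:

spark-submit --files log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" ...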
Thanks,
Suri
Created 04-14-2017 10:38 AM
It's true that you can aggregate logs to HDFS while the job is still running; however, the minimum log upload interval (yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds) you can set is 3600 seconds, i.e. 1 hour. The design protects the NameNode from being spammed.
You may have to use an external service to do the log aggregation. Either write your own or find other tools.
Below is the relevant entry from yarn-default.xml in the hadoop-common source code (branch cdh5-2.6.0_5.7.1).
<property>
  <description>Defines how often NMs wake up to upload log files.
  The default value is -1. By default, the logs will be uploaded when
  the application is finished. By setting this configure, logs can be uploaded
  periodically when the application is running. The minimum rolling-interval-seconds
  can be set is 3600.
  </description>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>-1</value>
</property>
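If hourly granularity is good enough for you, a minimal yarn-site.xml sketch to turn rolling aggregation on would be (property names are the ones from the yarn-default.xml entry above; 3600 is the documented minimum):

<!-- turn aggregation on and upload logs of running apps every hour -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>

Once rolled logs start landing in HDFS, "yarn logs -applicationId <appId>" should be able to show whatever has been uploaded so far, even while the application is still running.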
Created 02-21-2017 07:26 AM
@mbigelow You are right. We turned it off because of the long-running jobs.
Do you know any other ways to implement log search other than Solr/Elasticsearch?
Suri
Created 02-21-2017 08:33 AM
We want to search for key phrases, and at the same time we want developers to be able to look into the raw logs for troubleshooting, along with alerts for specific errors.
Created 02-21-2017 01:15 PM
The documentation for YARN log aggregation says that logs are aggregated after an application completes.
Streaming jobs run for a much longer duration and potentially never terminate. I want to get the logs into HDFS for my streaming jobs before the application completes or terminates. What are better ways to do this, since log aggregation only happens after the job completes?
Suri
Created 02-21-2017 01:46 PM
Thanks, @mbigelow.
So, if I set yarn.log-aggregation.retain-check-interval-seconds to 60 seconds, will it send the logs to HDFS every 60 seconds even when the job has not finished? (Since streaming jobs run forever.)
Suri