
Log management for Long-running Spark Streaming Jobs on YARN Cluster


Contributor

Hi, we need a way to maintain and search the logs of long-running Spark Streaming jobs on YARN. Log aggregation is disabled in our cluster. We are considering Solr or Elasticsearch for search, and possibly Flume or Kafka to read the Spark job logs.

Any suggestions on how to implement search on these logs and manage them easily?

Thanks,

Suri

Accepted Solutions

Re: Log management for Long-running Spark Streaming Jobs on YARN Cluster

Contributor

You can achieve this by setting an appropriate value for the following property in yarn-site.xml:

yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds

YARN will then aggregate the logs of running jobs as well, instead of waiting for the application to finish.
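
As a rough sketch, the setting could look like the yarn-site.xml fragment below. The 3600-second interval is only an illustrative value, not one from this thread; as far as I know, the yarn-default.xml description gives -1 (upload only when the application finishes) as the default and 3600 as the minimum. The fragment also assumes log aggregation itself is switched on through yarn.log-aggregation-enable, which the question says is currently disabled in this cluster.

```xml
<!-- yarn-site.xml fragment; values are illustrative assumptions, not from the original answer -->
<property>
  <!-- Log aggregation has to be enabled before rolling uploads can happen -->
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- How often NodeManagers wake up to upload the logs of running applications -->
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```

With this in place, the uploaded logs should appear under the directory configured by yarn.nodemanager.remote-app-log-dir, where they can be picked up by whatever indexing tool is chosen.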

https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

Suri

