Log management for Long-running Spark Streaming Jobs on YARN Cluster

Rising Star

Hi, we need a way to maintain and search the logs of long-running Spark Streaming jobs on YARN. Log aggregation is disabled in our cluster. We are considering Solr or Elasticsearch for search, and possibly Flume or Kafka to read the Spark job logs.

Any suggestions on how to implement search on these logs and manage them easily?

Thanks,

Suri

13 REPLIES

Rising Star

@mbigelow But other sources say: "Set yarn.log-aggregation.retain-check-interval-seconds to specify how often the log retention check should be run. By default, it is one-tenth of the log retention time." My understanding is that this interval only controls how often the retention check runs, and does not cause logs to be aggregated at that interval. Did I understand that correctly?

Suri
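
To make the quoted default concrete, here is a minimal Python sketch of the "one-tenth of the log retention time" rule. It only restates the arithmetic from the quote; the function name and the 7-day retention value are hypothetical, and the sketch assumes that a zero or negative interval means "use the default". It says nothing about when logs are actually aggregated.

```python
def effective_check_interval(retain_seconds: int, check_interval_seconds: int = -1) -> int:
    """Return how often (in seconds) the retention check would run.

    Assumption: a zero or negative check interval means "not set", in which
    case the interval falls back to one-tenth of the retention time, as the
    quoted description states.
    """
    if check_interval_seconds <= 0:
        return retain_seconds // 10
    return check_interval_seconds


# Hypothetical example: logs retained for 7 days, check interval left unset.
seven_days = 7 * 24 * 60 * 60
print(effective_check_interval(seven_days))        # retention check interval in seconds
print(effective_check_interval(seven_days, 3600))  # an explicit interval is used as-is
```

With these example numbers, an unset interval works out to roughly 16.8 hours between retention checks.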

Rising Star
Thank you, I will try it out.
