YARN Log Aggregation with Spark2 Streaming - Large log files

Explorer

I'm running several Spark2 Streaming applications in a YARN cluster. I have yarn.log-aggregation-enable=true, and the log files stored in HDFS grow unbounded. Before I moved to YARN, I was running on Spark Standalone and used spark.executor.logs.rolling.{strategy, time.interval, maxRetainedFiles} to manage log files, and it worked great. I've tried all sorts of settings to keep the aggregate logs to a manageable size, with no luck.

Can someone direct me to the configuration setting(s) that can help define how these aggregate logs are purged? An ideal scenario would allow me to manage them by time and size.

Thanks in advance.

8 REPLIES

Rising Star

YARN log aggregation retention can be controlled by setting the yarn.log-aggregation.retain-seconds property in yarn-site.xml.

For example, if you want logs older than 30 days to be deleted, set yarn.log-aggregation.retain-seconds to 2592000.
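The value is just the retention period expressed in seconds. A minimal sketch of the arithmetic follows; `retain_seconds` is a hypothetical helper name used for illustration and is not part of YARN.

```python
from datetime import timedelta


def retain_seconds(days: float = 0, hours: float = 0, minutes: float = 0) -> int:
    """Convert a retention period into a value for yarn.log-aggregation.retain-seconds."""
    return int(timedelta(days=days, hours=hours, minutes=minutes).total_seconds())


print(retain_seconds(days=30))     # 2592000
print(retain_seconds(minutes=10))  # 600
```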

Explorer

Thanks for the response, Tarun. I've tried that setting with no luck. I currently have it set to 600 as a test, since I only want to see logs for the last 10 minutes, yet I still have logs in there from yesterday. Is there a minimum I might be missing? I know the setting has a disclaimer that says not to set it too low or it will spam the name node, but it does not indicate a minimum threshold.

Does this work differently since it's a long-running (streaming) application and I'm technically using the same log file the entire time? The language in the description of this setting implies it deletes the file, whereas in my case it would need to remove lines from a file that is still being written to.

Rising Star

The retain-seconds setting will not work for an active application that is still writing files. It works by checking whether the last-modified timestamp of the application log directory is older than retain-seconds. Since your streaming job writes logs continuously, the directory's timestamp will never be more than 600 seconds old, so your logs are not getting deleted.

Also, log aggregation in YARN does not work the same way as log rolling/retention in log4j, which is what you seem to be expecting.

- Update/configure log4j in your Spark application so that your executor logs get rotated by interval/size.

- Set yarn.nodemanager.log-aggregation.debug-enabled=true and yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds=<roll interval> in YARN and restart the YARN service.
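The check described above can be illustrated with a simplified sketch. This is not YARN's actual code; `is_eligible_for_deletion` and its parameters are hypothetical names, used only to show why a directory that is written to continuously never ages past the retention window.

```python
import time
from typing import Optional


def is_eligible_for_deletion(
    last_modified: float, retain_seconds: int, now: Optional[float] = None
) -> bool:
    """Return True when last_modified is older than the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retain_seconds
    return last_modified < cutoff


now = 1_000_000.0  # fixed "current time" so the example is reproducible

# Finished application: directory last written one day ago.
print(is_eligible_for_deletion(now - 86_400, 600, now))  # True

# Streaming application: directory last written 5 seconds ago.
print(is_eligible_for_deletion(now - 5, 600, now))  # False
```

With retain-seconds at 600, the streaming application's directory would only qualify once it had gone unmodified for more than ten minutes, which does not happen while the job keeps logging.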

Explorer

Thanks, Sandeep. I'm working on this now and will report back once I've got it all set up.

Explorer

Sandeep, thanks for the response. As suggested, I have the following configurations established for the executors:

spark.executor.logs.rolling.strategy time
spark.executor.logs.rolling.maxRetainedFiles 72
spark.executor.logs.rolling.time.interval {various settings}

I've tested both the hourly and minutely settings for the above time interval, and both seem to work: they roll the executor logs as they should.

I've also set yarn.nodemanager.log-aggregation.debug-enabled=true and yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds={various settings} and restarted YARN. Sadly, I'm not seeing the aggregate logs respect any of the settings; they just continue to grow and grow.

Any other tips?

@Andrew Mills What value have you set for yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds? The NodeManager should aggregate the logs at this interval. I'm also assuming that your log4j configuration is working as expected and the logs are being rolled.

New Contributor

Hey guys, I'm hitting this problem as well. My old applications (Hive and HDFS) have already finished, but the cleanup function is never triggered for their logs. Permissions are fine, and the last-modified timestamps are older than retain-seconds, so the logs should qualify for deletion. Even with debug mode enabled, I can't find any log-retention messages showing that the cleanup started or was skipped. Any idea how to trigger it?

Thank you
