MapReduce JobHistory server is unable to read aggregated log file from a busy NodeManager

New Contributor

I have been trying to figure out why logs are missing on the JobHistory server for the YARN containers of completed MapReduce jobs. The problem only seems to arise when a single NodeManager is running many containers (12+); with fewer containers it does not come up. The NodeManager has 13 data disks, which is where the YARN container local and log data gets stored. Log aggregation reports that it completes successfully and the aggregated log file looks correct (I was able to read it using LogAggregationIndexedFileController), but in the JobHistory server I see this message when trying to look at a completed map task's logs:

2018-07-13 00:48:34,244 WARN webapp.View (IndexedFileAggregatedLogsBlock.java:render(139)) - Can not load log meta from the log file:hdfs://hadoopnn1.net:8020/app-logs/rana/logs-ifile/application_1530921118753_0010/hadoopdn4.net_45454
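
The path in that warning can be split into its parts to confirm which user, application, and NodeManager the unreadable file belongs to. Below is a minimal Python sketch. It assumes the layout `<remote log dir>/<user>/<suffix>/<application id>/<host>_<port>`, which is inferred from this one path, and the function name `split_aggregated_log_path` is made up for illustration.

```python
from urllib.parse import urlparse

# Path copied from the WARN line above.
LOG_PATH = (
    "hdfs://hadoopnn1.net:8020/app-logs/rana/logs-ifile/"
    "application_1530921118753_0010/hadoopdn4.net_45454"
)


def split_aggregated_log_path(path):
    """Split an aggregated-log path into its components.

    Assumed layout (inferred from the single path in the warning):
    <remote log dir>/<user>/<suffix>/<application id>/<host>_<port>
    """
    parts = urlparse(path).path.strip("/").split("/")
    if len(parts) < 5:
        raise ValueError("unexpected path layout: " + path)
    # The node file name is the NodeManager host and port joined by "_".
    host, _, port = parts[-1].rpartition("_")
    return {
        "remote_log_dir": "/" + "/".join(parts[:-4]),
        "user": parts[-4],
        "suffix": parts[-3],
        "application_id": parts[-2],
        "node_host": host,
        "node_port": int(port),
    }


info = split_aggregated_log_path(LOG_PATH)
for key, value in info.items():
    print(key, "=", value)
```

With the application ID and owner in hand, the same aggregated logs can also be requested from the command line with `yarn logs -applicationId application_1530921118753_0010 -appOwner rana`, which helps show whether the problem is specific to the JobHistory web view.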

Current log aggregation settings are:

"yarn.log-aggregation.file-formats" : "IndexedFormat,TFile",
"yarn.nodemanager.log-aggregation.debug-enabled" : "false",
"yarn.log-aggregation.retain-seconds" : "2592000",
"yarn.nodemanager.log-aggregation.num-log-files-per-app" : "336",
"yarn.log-aggregation.file-controller.IndexedFormat.class" : "org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController",
"yarn.log-aggregation-enable" : "true",
"yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds" : "3600",
"yarn.log-aggregation.file-controller.TFile.class" : "org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController",
"yarn.nodemanager.log-aggregation.compression-type" : "gz",
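
For readers comparing against their own clusters, the numeric settings above convert to more familiar units as in the short Python sketch below. The last figure assumes that one aggregated file is uploaded per roll interval, which is an assumption about rolling aggregation and not something confirmed in this thread.

```python
# Values copied from the settings listed above.
settings = {
    "yarn.log-aggregation.retain-seconds": 2592000,
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds": 3600,
    "yarn.nodemanager.log-aggregation.num-log-files-per-app": 336,
}

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400

retain_days = settings["yarn.log-aggregation.retain-seconds"] / SECONDS_PER_DAY
roll_hours = (
    settings["yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds"]
    / SECONDS_PER_HOUR
)

# Assumption: one aggregated file per roll interval, so the file cap
# multiplied by the interval gives the window covered by rolled files.
rolling_window_days = (
    settings["yarn.nodemanager.log-aggregation.num-log-files-per-app"]
    * settings["yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds"]
    / SECONDS_PER_DAY
)

print("retention (days):", retain_days)
print("roll interval (hours):", roll_hours)
print("window covered by rolled files (days):", rolling_window_days)
```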

Has anyone seen this before?

New Contributor

This is still a major problem for me. Can someone please help? Log aggregation seems broken if it cannot be used on a busy cluster with MapReduce.

Versions are:

HDP-2.6.4.0

HDFS            2.7.3
YARN            2.7.3
MapReduce2      2.7.3
Tez             0.7.0
Hive            1.2.1000
HBase           1.1.2
Pig             0.16.0
Oozie           4.2.0
ZooKeeper       3.4.6
Ambari Infra    0.1.0
Ambari Metrics  0.1.0
Ranger          0.7.0
Ranger KMS      0.7.0
Slider          0.92.0

New Contributor

Still the same case for me; I am using HDP-2.6.5.41.
