Created on 06-22-2015 03:28 PM - edited 09-16-2022 02:32 AM
Using Cloudera Manager I can set property "yarn.log-aggregation-enable" to "true".
I can then run "Deploy Client Configuration" from Cloudera Manager.
However, if I then run "hadoop classpath" or "yarn classpath", the Hadoop configuration directory, which is typically the first entry in the classpath, does not include an updated yarn-site.xml with "yarn.log-aggregation-enable" set to "true". Instead, it has the original yarn-site.xml which has no "yarn.log-aggregation-enable" property in it.
Typically the first entry in the classpath is "/etc/hadoop/conf" from "hadoop classpath" or "yarn classpath".
In contrast, if I run a YARN application that starts a Java task, I can print the system property "java.class.path", and the first entry is a directory that does contain an updated yarn-site.xml with the property set to "true". For example, instead of "/etc/hadoop/conf", the first directory I see in one task is "/var/run/cloudera-scm-agent/process/840-yarn-NODEMANAGER". In fact, there is an environment variable, HADOOP_CONF_DIR, which points to the correct Hadoop config dir.
But this directory, the one in /var/run/cloudera-scm-agent, is not included in "hadoop classpath" or "yarn classpath".
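For reference, here is a minimal sketch of how the two directories can be compared. It assumes the standard Hadoop configuration layout (a `<configuration>` root with `<property>` children, each holding a `<name>` and a `<value>`). The helper name `read_hadoop_property` is my own and not part of any Hadoop tooling.

```python
import os
import xml.etree.ElementTree as ET


def read_hadoop_property(conf_dir, name, filename="yarn-site.xml"):
    """Return the value of `name` in conf_dir/filename, or None if absent.

    Hypothetical helper: assumes the usual Hadoop XML layout of
    <configuration><property><name/><value/></property></configuration>.
    """
    path = os.path.join(conf_dir, filename)
    if not os.path.isfile(path):
        return None
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


if __name__ == "__main__":
    # Compare the client directory with whatever HADOOP_CONF_DIR points to.
    for conf_dir in ("/etc/hadoop/conf", os.environ.get("HADOOP_CONF_DIR", "")):
        if conf_dir:
            value = read_hadoop_property(conf_dir, "yarn.log-aggregation-enable")
            print(conf_dir, "->", value)
```

A result of `None` means the file or the property is missing from that directory, which matches what I see for "/etc/hadoop/conf".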
In our application, we need to get the correct Hadoop config dir without running a YARN task. Even if I create a small Java program that prints the environment variable HADOOP_CONF_DIR or the system property "java.class.path" and run it with "hadoop jar", I do not get the correct results.
How do I get the correct Hadoop configuration directory without running a YARN job?
Thanks
Created 06-22-2015 04:14 PM
Created 06-23-2015 08:34 AM
Thank you for the excellent clarification.
In our most typical use case, we submit a YARN application from a machine that is outside of the cluster, so the Client process runs on an external machine. Once the submitted YARN app completes, our Client attempts to fetch the aggregated logs. Before attempting the fetch, I had included a check to see if log aggregation was enabled, simply to save time. I now believe that I should remove that check, since the property is not a client-side property.
So the explanation answers my initial question. A related question: Is there a way for a Client process running on an external machine to check if log aggregation has been completed?
Thanks
Created 06-29-2015 12:33 AM
No, there is nothing that you can run to check whether log aggregation has finished. It is distributed state that is known only inside the NodeManagers.
The only thing you can do is retry the log retrieval. Log aggregation is performed by the NodeManager(s) when the containers finish.
There is no way to tell how long that will take, since one node could be running more than one container that finishes at almost the same time. The load on HDFS is also a factor: copying to HDFS will only be as fast as HDFS can handle it at that point.
Wilfred
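A retry of the log retrieval could be sketched roughly as follows. This is only an illustration, not an official client: the function names, the attempt count, and the delays are assumptions. It also assumes that a nonzero exit code or empty output from `yarn logs -applicationId <id>` means the logs are not available yet, which may differ between Hadoop versions.

```python
import subprocess
import sys
import time


def fetch_with_retry(fetch, attempts=5, initial_delay=5.0, backoff=2.0,
                     sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out.

    The delay between attempts grows by `backoff` each time, because there
    is no way to know in advance how long aggregation will take.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        result = fetch()
        if result:
            return result
        if attempt < attempts:
            sleep(delay)
            delay *= backoff
    return None


def yarn_logs(application_id):
    """Run `yarn logs -applicationId <id>` and return its output, or None.

    Assumption: failure or empty output means "not aggregated yet".
    """
    proc = subprocess.run(
        ["yarn", "logs", "-applicationId", application_id],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0 or not proc.stdout.strip():
        return None
    return proc.stdout


if __name__ == "__main__":
    app_id = sys.argv[1]  # application ID supplied by the caller
    logs = fetch_with_retry(lambda: yarn_logs(app_id))
    print(logs if logs is not None else "Logs not available after retries")
```

The retry logic is kept separate from the `yarn logs` call so that the same loop can wrap whichever retrieval method the client already uses.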
Created 06-29-2015 08:56 AM
Thanks for the explanation. And thanks for tolerating me extending the original question.
This issue can be closed.