Created 04-05-2018 10:31 PM
I'm trying to view aggregated logs. Everything seems to be set up correctly, and it appears logs are aggregating without issue. However, they're nowhere to be found.
I have yarn.log-aggregation-enable set to true and yarn.nodemanager.remote-app-log-dir set to /tmp.
/tmp is owned by hdfs and is accessible to the hadoop group.
I have yarn.log-aggregation.retain-seconds set to 2419200.
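For reference, this is how the settings described above would look in yarn-site.xml (property names and values are the ones quoted in this post):

```xml
<!-- Log aggregation settings as described above -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>2419200</value>
</property>
```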
Once the job finished I don't see the directory /tmp/logs (or /tmp/{user} or any other new directory in /tmp) in hdfs. I am unable to view the job logs through Yarn UI.
Running yarn logs -applicationId {application_id} returns no results.
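In case it helps reproduce the check, these are roughly the commands I'm running (the application id is a placeholder; these need to run on a node with the Hadoop client configured):

```shell
# <application_id> is a placeholder - substitute a real id from the RM UI
yarn logs -applicationId <application_id>

# Also checking whether the remote app-log dir got created at all
hdfs dfs -ls /tmp
hdfs dfs -ls /tmp/logs
```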
When I check on the job status in the Yarn UI, it shows Log Aggregation Status: Succeeded.
Resource manager logs are available and show no errors.
Any ideas as to where these logs are disappearing to?
We're running Hadoop 2.5.3.
Created 04-05-2018 11:13 PM
Can you change ownership of /tmp in HDFS to yarn:hadoop instead of hdfs:hadoop? Is it a secure cluster?
Created 04-06-2018 03:30 PM
@Gour Saha I've set yarn.nodemanager.remote-app-log-dir=/app-logs and then set the owner of /app-logs to yarn:hadoop, but the problem remains (obviously I'm now checking /app-logs instead of /tmp for log files).
The cluster is not secure - Kerberos is disabled.
Created 04-07-2018 06:24 AM
Can you check the NM log in the host where at least one container of your job ran to see if you find any errors related to log-aggregation?
Created 04-10-2018 03:54 PM
Pulling the logs from the NodeManager, I see no errors of any kind.
The application finishes, the log says it's uploading the logs for each container, and then it moves on to the next thing.
The only thing that may be relevant is "Current good log dirs are /mnt/resource/hadoop/yarn/log", which I think refers to the local log directory.
There is also (Slf4jLog.java:info(67)) - Aliases are enabled
Nothing else of note.
Created 04-10-2018 08:25 PM
What kind of apps are you running? Can you check under the {application_id}/{container_id} directories under yarn.nodemanager.log-dirs if the apps are creating any logs? Are the apps creating a sub-directory under the container dir and then logging under that? Note, logs created under sub-directories are not aggregated.
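To illustrate the caveat above, here is a minimal sketch (not YARN's actual code) of a per-container upload that only picks up regular files at the top level of the container directory. Anything the app writes into a sub-directory is skipped and never reaches the aggregated log in HDFS:

```python
import os

def files_to_aggregate(container_dir):
    """Collect only regular files directly under the container dir.

    Mirrors the caveat above: files inside sub-directories are skipped,
    so logs written there never make it into the aggregated log.
    """
    picked = []
    for name in sorted(os.listdir(container_dir)):
        path = os.path.join(container_dir, name)
        if os.path.isfile(path):  # top-level regular file: aggregated
            picked.append(name)
        # directories (and everything inside them) are ignored
    return picked
```

So if your app logs to, say, `container_.../app-output/app.log` rather than directly to `stdout`/`stderr` in the container dir, aggregation would come back empty.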
Created 01-23-2019 01:23 PM
I have a very similar issue. After integrating the entire cluster, all services including Ranger, with Active Directory, the YARN application log folders for Active Directory users are empty.
This does not happen when the application is run as the hdfs user, so it has to be a permissions problem.
While an application is running I can see the log folders and files on the node running the container for the application:
/opt/hadoop/yarn/log/application_1548066747260_2576
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001/launch_container.sh
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001/directory.info
As soon as the application stops the log folder is created in hdfs, but it is empty.
The Ranger policies have been set up to give service accounts (e.g. hdfs, hive, spark etc.) full access to the entire hdfs file structure.
AD users who are allowed to submit spark jobs are given access to these folders in an attempt to allow application logs to work:
Any suggestions will be greatly appreciated.
I can post more configuration and settings information if needed.
Created 02-06-2019 08:52 AM
I've found the problem (my problem, at least). I hope this triggers some answer from the HDP community.
My AD user name is in CamelCase, so I get this in the YARN logs on the node where the app was running:
java.io.IOException: Owner 'myusername' for path /grid/1/hadoop/yarn/log/application_1549437510290_0039/container_e80_1549437510290_0039_01_000001/stderr did not match expected owner 'MyUserName'
The user names differ only in case, so log aggregation fails and /app-logs/MyUserName/logs/application_1549437510290_0039/ is empty! Please help!
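For what it's worth, the failure above reduces to a case-sensitive string comparison of the file's owner against the expected user. This sketch (an assumption about the shape of the check, not Hadoop's actual code) shows why 'myusername' vs 'MyUserName' fails:

```python
def verify_owner(actual_owner, expected_owner):
    """Sketch of the owner check behind the IOException above.

    The comparison is exact (case-sensitive), so an OS-reported owner
    'myusername' does not match a YARN-expected owner 'MyUserName'.
    """
    if actual_owner != expected_owner:
        raise IOError(
            "Owner '%s' did not match expected owner '%s'"
            % (actual_owner, expected_owner)
        )
```

Since the OS account and the AD principal disagree only in letter case, every container's stderr/stdout fails this check and nothing gets uploaded.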