
Logs disappearing after Yarn Log Aggregation

New Contributor

I'm trying to view aggregated logs. Everything seems to be set up correctly, and it appears the logs are aggregating without issue; however, they're nowhere to be found.
I have yarn.log-aggregation-enable set to true and yarn.nodemanager.remote-app-log-dir set to /tmp.

/tmp is owned by hdfs and is accessible to the hadoop group

I have yarn.log-aggregation.retain-seconds set to 2419200 (28 days).

Once the job finishes, I don't see the directory /tmp/logs (or /tmp/{user}, or any other new directory under /tmp) in HDFS, and I am unable to view the job logs through the YARN UI.

Running yarn logs -applicationId {application_id} returns no results.

When I check the job status in the YARN UI, it shows Log Aggregation Status: Succeeded.

The ResourceManager logs are available and show no errors.

Any ideas as to where these logs are disappearing to?

We're running Hadoop 2.5.3.
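For reference, here is roughly what the relevant part of my yarn-site.xml looks like (these are the standard YARN property names; the values are the ones described above):

```xml
<!-- yarn-site.xml: sketch of the log-aggregation settings in use -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>2419200</value>
</property>
```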

7 REPLIES

Expert Contributor

Can you change ownership of /tmp in HDFS to yarn:hadoop instead of hdfs:hadoop? Is it a secure cluster?
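For example, something along these lines, run as the HDFS superuser (this assumes the remote app-log dir is still /tmp; the 1777 mode just mirrors the usual world-writable /tmp convention):

```shell
# Hand the aggregation root to the yarn user so the NodeManagers
# can create per-user directories under it.
sudo -u hdfs hdfs dfs -chown yarn:hadoop /tmp
sudo -u hdfs hdfs dfs -chmod 1777 /tmp
```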

New Contributor

@Gour Saha I've set yarn.nodemanager.remote-app-log-dir=/app-logs and changed the owner of /app-logs to yarn:hadoop, but the problem remains (naturally I'm now checking /app-logs instead of /tmp for the log files).
The cluster is not secure - Kerberos is disabled.

Expert Contributor

Can you check the NM log in the host where at least one container of your job ran to see if you find any errors related to log-aggregation?
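Something like this on the NodeManager host should surface any aggregation errors (the log path here is a guess; adjust it to wherever your NM logs actually live):

```shell
# LogAggregationService is the NM component that uploads container logs.
grep -i "log.aggregation\|LogAggregationService" \
    /var/log/hadoop-yarn/yarn/*nodemanager*.log
```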

New Contributor

@Gour Saha

I pulled the logs from the NodeManager: no errors of any kind.

The application finishes, the log says it's uploading the logs for each container, and then it moves on to the next thing.
The only thing that may be relevant is the line Current good log dirs are /mnt/resource/hadoop/yarn/log, which I believe refers to the local log directory.

There is also (Slf4jLog.java:info(67)) - Aliases are enabled

Nothing else of note.

Expert Contributor

What kind of apps are you running? Can you check the {application_id}/{container_id} directories under yarn.nodemanager.log-dirs to see whether the apps are creating any logs? Are the apps creating a sub-directory under the container dir and then logging under that? Note that logs created under sub-directories are not aggregated.
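While an application is running, something like this (using the local log dir mentioned earlier in the thread) will show whether the containers are writing files directly under their own directories; anything nested one level deeper is skipped by aggregation:

```shell
# Files directly under each container dir: these ARE aggregated.
find /mnt/resource/hadoop/yarn/log/application_*/container_* -maxdepth 1 -type f

# Files inside sub-directories of a container dir: these are NOT aggregated.
find /mnt/resource/hadoop/yarn/log/application_*/container_* -mindepth 2 -type f
```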

Explorer

I have a very similar issue. After integrating the entire cluster (all services, including Ranger) with Active Directory, the YARN application log folders for Active Directory users are empty.

This does not happen when the application is run as the hdfs user, so it has to be a permissions problem.

While an application is running I can see the log folders and files on the node running the container for the application:

/opt/hadoop/yarn/log/application_1548066747260_2576
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001/launch_container.sh
/opt/hadoop/yarn/log/application_1548066747260_2576/container_e71_1548066747260_2576_01_000001/directory.info

As soon as the application stops the log folder is created in hdfs, but it is empty.

The Ranger policies have been set up to give service accounts (e.g. hdfs, hive, spark etc.) full access to the entire hdfs file structure.

AD users who are allowed to submit Spark jobs have been given access to these folders in an attempt to get application logging to work:

  • /tmp
  • /mr-history
  • /ats
  • /spark-history
  • /spark2-history
  • /livy-recovery
  • /livy2-recovery
  • /app-logs
  • /app-logs/oozie
  • /app-logs/{USER}
  • /user/{USER}

Any suggestions will be greatly appreciated.

I can post more configuration and settings information if needed.

Explorer

I see the problem (my problem). I hope this triggers some answer from the HDP community.

My AD user name is in CamelCase, so I get this in the YARN logs on the node where the app was running:

java.io.IOException: Owner 'myusername' for path /grid/1/hadoop/yarn/log/application_1549437510290_0039/container_e80_1549437510290_0039_01_000001/stderr did not match expected owner 'MyUserName'

The user names differ only in case, so log aggregation fails and /app-logs/MyUserName/logs/application_1549437510290_0039/ is empty! Please help!
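To illustrate the failure mode I think I'm hitting, here is a sketch of the check behind the IOException above (this mirrors the behavior, not the actual Hadoop source): the uploader compares the on-disk owner of each container log file with the user the application ran as, using an exact, case-sensitive string comparison.

```python
def owner_matches(file_owner: str, app_user: str) -> bool:
    """Case-sensitive owner check, as a sketch of the comparison that
    raises the IOException above (not the actual Hadoop source)."""
    return file_owner == app_user

# The OS reports the lowercase local account, while YARN tracked the
# CamelCase AD principal, so the check fails and nothing is uploaded.
print(owner_matches("myusername", "MyUserName"))  # False
print(owner_matches("myusername", "myusername"))  # True
```

If that is the cause, one possible avenue (an assumption on my part, not something I've verified here) is to normalize principals to lowercase during name mapping, e.g. the /L flag on a hadoop.security.auth_to_local RULE lowercases the resulting short name, so the user YARN tracks matches the lowercase OS account.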