Community Articles

SBren · 04-03-2017

PROBLEM: user1 has submitted a YARN application with the application ID application_1473860344791_0001. To view the application's logs, you would run the following command:

yarn logs -applicationId application_1473860344791_0001

If you run the command above as 'user2', you may see output similar to the following:

16/09/19 23:00:23 INFO impl.TimelineClientImpl: Timeline service address: http://mycluster.somedomain.com:8188/ws/v1/timeline/
16/09/19 23:00:23 INFO client.RMProxy: Connecting to ResourceManager at mycluster.somedomain.com/192.168.1.89:8050
/app-logs/user2/logs/application_1473860344791_0001 does not exist.
Log aggregation has not completed or is not enabled.

ROOT CAUSE: When log aggregation has been enabled, each user's application logs are, by default, placed in the directory hdfs:///app-logs/<USERNAME>/logs/<APPLICATION_ID>. By default, only the user that submitted the job and members of the hadoop group can read the log files. In the example directory listing below, the permissions are 770 (drwxrwx---), so no one other than the owner and members of the hadoop group has access.

[root@mycluster ~]$ hdfs dfs -ls /app-logs
Found 3 items
drwxrwx---   - hive     hadoop          0 2017-03-10 15:33 /app-logs/hive
drwxrwx---   - user1    hadoop          0 2017-03-10 15:37 /app-logs/user1
drwxrwx---   - spark    hadoop          0 2017-03-10 15:39 /app-logs/spark
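The path layout described above also explains why the error for user2 mentions /app-logs/user2 rather than /app-logs/user1: the lookup path is built from the name of the user running the command. A minimal sketch of that derivation, assuming the default /app-logs/<USERNAME>/logs/<APPLICATION_ID> layout; the helper name app_log_dir is hypothetical and is not part of YARN:

```shell
# Hypothetical helper: builds the default aggregated-log directory
# for a given user ($1) and application ID ($2).
app_log_dir() {
  printf '/app-logs/%s/logs/%s\n' "$1" "$2"
}

# Path searched when user2 runs the command (does not exist):
app_log_dir user2 application_1473860344791_0001
# prints /app-logs/user2/logs/application_1473860344791_0001

# Path where the logs of user1's application are actually stored:
app_log_dir user1 application_1473860344791_0001
# prints /app-logs/user1/logs/application_1473860344791_0001
```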

SOLUTION: The message "Log aggregation has not completed or is not enabled" can be misleading; it does not necessarily mean that log aggregation is disabled. As the error output shows, 'yarn logs' looked for the logs under the directory of the user running the command (/app-logs/user2/logs/...), so the command must be executed as the user that submitted the application. In this example the application was submitted by user1. If we run the same command as 'user1', we should get output like the following, provided log aggregation has been enabled.

yarn logs -applicationId application_1473860344791_0001
16/09/19 23:10:33 INFO impl.TimelineClientImpl: Timeline service address: http://mycluster.somedomain.com:8188/ws/v1/timeline/
16/09/19 23:10:33 INFO client.RMProxy: Connecting to ResourceManager at mycluster.somedomain.com/192.168.1.89:8050
16/09/19 23:10:34 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/09/19 23:10:34 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
Container: container_e03_1473860344791_0001_01_000001 on mycluster.somedomain.com_45454
===============================================================================
LogType:stderr
Log Upload Time:Wed Sep 14 09:44:15 -0400 2016
LogLength:0
Log Contents:
End of LogType:stderr

.....truncated output.....
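If switching to the submitting user is not practical, 'yarn logs' also accepts an -appOwner option that names the application owner explicitly. The sketch below assumes the calling user can already read /app-logs/user1 in HDFS (for example, as a member of the hadoop group). The helper yarn_logs_cmd is hypothetical; it only prints the command so that it can be reviewed before being run on the cluster:

```shell
# Hypothetical helper: prints the yarn logs command for an application
# owned by another user. It does not run anything on the cluster.
#   $1 = application ID
#   $2 = user that submitted the application
yarn_logs_cmd() {
  printf 'yarn logs -applicationId %s -appOwner %s\n' "$1" "$2"
}

yarn_logs_cmd application_1473860344791_0001 user1
# prints yarn logs -applicationId application_1473860344791_0001 -appOwner user1
```

Naming the owner only changes which directory is searched. A user without read access to the owner's log directory should still be refused by the HDFS permissions shown earlier.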

REFERENCE: The following document describes how to use log aggregation to collect logs for long-running YARN applications.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_yarn-resource-management/content/ch_log_a...

Unable to obtain logs from a yarn application. How can I get the logs?

Apache YARN