Yarn log history?

Contributor

I know that our user logs and backups go to /app-logs/<USER> and /app-logs/backup, respectively, in HDFS. However, when I parse the local job log, /var/log/hadoop-yarn/yarn/hadoop-mapreduce.jobsummary.log, the last entry I get is from 2017-01-25. The web UI for the Yarn Resource Manager only shows about a day's worth of finished jobs. How can I review a more comprehensive history, especially for auditing? Am I stuck with reviewing logs in the user/backup HDFS dirs based on the date written? We want to have a trail of what was run.

I noticed the web UI for the Yarn Resource Manager lists "Showing 1 to 100 of 9,994 entries" at the bottom when looking at the finished jobs, so they must be somewhere.

9 REPLIES

Master Guru

Try "yarn application -list" to get a list of all Yarn apps, followed by "yarn logs -applicationId" to get logs for a particular app. You can find details about these and other yarn commands here.

Contributor

I already know all of what you mentioned. As I said in my original post, I am asking for the full history of all jobs; what you see in Resource Manager is only a snippet of it. Re-read what I wrote.

Master Guru

Have you tried "yarn application -list -appStates ALL" ?

Contributor

This is helpful, but I am also looking for what is beyond that threshold in history. If that is not available, what property would extend retention beyond the apparent 24 hours that Resource Manager / Yarn shows for finished applications?

Master Guru

The "historical period" or TTL of Yarn logs is controlled by yarn.timeline-service.ttl-enable, and if "true", then by yarn.timeline-service.ttl-ms. By default it's enabled and TTL is 31 days.
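To make the units concrete: yarn.timeline-service.ttl-ms is expressed in milliseconds, so the 31 days mentioned above corresponds to 2678400000. The sketch below only does that arithmetic; the helper names are illustrative and not part of Hadoop.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def days_to_ttl_ms(days: int) -> int:
    """Convert a retention period in days to a ttl-ms value."""
    return days * MS_PER_DAY


def ttl_ms_to_days(ttl_ms: int) -> float:
    """Convert a ttl-ms value back to days."""
    return ttl_ms / MS_PER_DAY


print(days_to_ttl_ms(31))  # 2678400000
print(days_to_ttl_ms(90))  # 7776000000
```

The second value is what one would set for a hypothetical 90-day retention.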

Guru

@Michael DeGuzis, Yarn typically stores the history of all applications in either the MapReduce History Server (MapReduce jobs only) or the Application Timeline Server (all types of Yarn applications). Please verify that ATS (Application Timeline Server) is installed on your cluster. Look for the property below in yarn-site.xml:

<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>

If ATS is installed on your cluster, look for the following property in yarn-site.xml to find the Timeline Server web app address.

<property>
  <name>yarn.timeline-service.webapp.address</name>
  <value>host1:8188</value>
</property>
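If checking these properties by hand gets tedious, something like the following could read them out of the file. This is a minimal sketch: the function name is illustrative, and the sample text stands in for a real yarn-site.xml (on many installs it lives under /etc/hadoop/conf/, but that path is an assumption).

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text: str) -> dict:
    """Return {name: value} for every <property> in a Hadoop-style config file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Sample content mirroring the two properties discussed in this thread.
SAMPLE = """<configuration>
  <property>
    <name>yarn.timeline-service.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.timeline-service.webapp.address</name>
    <value>host1:8188</value>
  </property>
</configuration>"""

props = read_hadoop_properties(SAMPLE)
print(props.get("yarn.timeline-service.enabled"))         # true
print(props.get("yarn.timeline-service.webapp.address"))  # host1:8188
```

On a real cluster, pass the contents of your own yarn-site.xml instead of SAMPLE.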

Now you can list all applications using the API below.

http://host1:8188/ws/v1/applicationhistory/apps

This REST API call will return a JSON object containing all the applications with their metadata, such as application ID, user, name, queue, appState, etc.
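For an audit trail, the response can be summarised with a short script. The sketch below assumes the applications arrive as a list under an "app" key with fields such as "appId", "user" and "appState", as described in the Timeline Server documentation; verify this against your own output. The function names and the sample payload are illustrative, and fetch_apps does no Kerberos or SSL handling.

```python
import json
from collections import Counter
from urllib.request import Request, urlopen


def fetch_apps(base_url: str) -> list:
    """Fetch the application list from a Timeline Server (plain HTTP, no auth)."""
    req = Request(base_url + "/ws/v1/applicationhistory/apps",
                  headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return extract_apps(json.load(resp))


def extract_apps(payload: dict) -> list:
    """Pull the list of application records out of the response body."""
    return payload.get("app", [])


def count_by_user(apps: list) -> Counter:
    """Count applications per submitting user."""
    return Counter(app.get("user", "unknown") for app in apps)


# Made-up sample in the assumed response shape, for illustration only.
SAMPLE_PAYLOAD = {"app": [
    {"appId": "application_1_0001", "user": "alice", "appState": "FINISHED"},
    {"appId": "application_1_0002", "user": "bob", "appState": "FAILED"},
    {"appId": "application_1_0003", "user": "alice", "appState": "FINISHED"},
]}

counts = count_by_user(extract_apps(SAMPLE_PAYLOAD))
print(counts["alice"], counts["bob"])  # 2 1
```

Against a live server you would call fetch_apps("http://host1:8188") in place of the sample payload.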

You can also find details on Timeline server at https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html

Contributor

If I try that, I get 'curl: (35) SSL received a record that exceeded the maximum permissible length.'

Contributor

Also, if this only shows more verbose output than 'yarn application -list -appStates ALL', that isn't entirely great either, as I am also looking for what is beyond that threshold in history. If that is not possible, what would extend it beyond the apparent 24 hours that Resource Manager / Yarn shows for finished applications?

Guru

The Timeline Server will typically have all the history data. Do you have SSL / wire encryption enabled in the cluster? That curl error usually means an HTTPS request was sent to a port that serves plain HTTP. Look for the "yarn.http.policy" property in yarn-site.xml. If it is set to "HTTPS_ONLY", then SSL is enabled.

If SSL is enabled, look for "yarn.timeline-service.webapp.https.address" in yarn-site.xml to find the HTTPS port. By default it uses port 8190.

You can try accessing Timeline server data as below.

SSL Enabled:
curl -i -k -s -1 -H 'Content-Type: application/json' -H 'Accept: application/json' --negotiate -u : 'https://host1:8190/ws/v1/applicationhistory/apps'

SSL Disabled:
curl -i -s -H 'Content-Type: application/json' -H 'Accept: application/json' --negotiate -u : 'http://host1:8188/ws/v1/applicationhistory/apps'
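The choice between the two URLs above can be sketched in a few lines. This assumes the properties have already been read into a dict; the function name and the default_host parameter are illustrative. The fallback ports 8188 and 8190 are the ones mentioned in this thread, and HTTP_ONLY is assumed when yarn.http.policy is not set.

```python
def timeline_base_url(props: dict, default_host: str = "host1") -> str:
    """Pick the Timeline Server base URL that matches yarn.http.policy."""
    policy = props.get("yarn.http.policy", "HTTP_ONLY")
    if policy == "HTTPS_ONLY":
        # SSL enabled: use the HTTPS address (port 8190 if not configured).
        address = props.get("yarn.timeline-service.webapp.https.address",
                            f"{default_host}:8190")
        return "https://" + address
    # SSL disabled: use the plain HTTP address (port 8188 if not configured).
    address = props.get("yarn.timeline-service.webapp.address",
                        f"{default_host}:8188")
    return "http://" + address


print(timeline_base_url({"yarn.http.policy": "HTTPS_ONLY"}))
# https://host1:8190
print(timeline_base_url({"yarn.timeline-service.webapp.address": "host1:8188"}))
# http://host1:8188
```

Mixing the scheme and port (for example https against 8188) is the kind of mismatch that produces the SSL record-length error quoted earlier.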