YARN JobHistory Logs (http://<server>:19888/jobhistory) not shown

Contributor

When I launch a Spark job, logs are created in the HDFS /tmp/logs/<user-id>/logs folder but NOT in the /user/history/ folders!
Then, when I open the JobHistory portal (http://<YARN-JobHistory-Server>:19888/jobhistory), it shows no jobs!
Is there a daemon that copies the logs from the /tmp/logs/<user-id>/logs folder to the /user/history/done and /user/history/done_intermediate folders?

 

Thank you in advance!

 

1 ACCEPTED SOLUTION

Expert Contributor

Hi TS,

 

Responses inline below:

 

> Having said that the JobHistory Server is specific to Map Reduce jobs run on YARN, where other type of jobs will be shown?

That will depend on what kind of application is being submitted to the YARN framework.  

 

We know that if an MR2 job is submitted, the job details will be available in the Resource Manager Web UI while the job is running (as this is part of the YARN framework). When the job is completed, the job details will be available via the Job History Server.

 

If a Spark-on-YARN job is submitted, the job details will still be available in the Resource Manager Web UI while the job is running. When the job completes, however, the job details will be available on the Spark History Server, which is a separate role/service that is configured when Spark-on-YARN is set up as a service in Cloudera Manager (or when configuring it in CDH, per our installation guide).
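
As a rough illustration of the routing described above, here is a minimal sketch that maps a YARN application type to the place where details of a finished job would be expected. It assumes the type strings `MAPREDUCE` and `SPARK` as the Resource Manager reports them, and it assumes the usual default ports (19888 for the Job History Server, 18080 for the Spark History Server), which your cluster may override. The helper name and the returned descriptions are made up for this example.

```python
# Hypothetical helper: where to look for a finished application's details.
# Ports are common defaults (an assumption); check your own configuration.
HISTORY_UI_BY_APP_TYPE = {
    "MAPREDUCE": "MapReduce Job History Server (default port 19888)",
    "SPARK": "Spark History Server (default port 18080)",
}


def history_ui_for(app_type: str) -> str:
    """Return a description of the UI expected to hold completed-job details."""
    key = app_type.strip().upper()
    return HISTORY_UI_BY_APP_TYPE.get(
        key,
        "Resource Manager Web UI only (no dedicated history server assumed)",
    )


if __name__ == "__main__":
    for app_type in ("MAPREDUCE", "spark", "OTHER"):
        print(app_type, "->", history_ui_for(app_type))
```

While an application is still running, the Resource Manager Web UI is the place to look regardless of type, so the lookup only matters once the job has finished.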

 

> Besides MR and Spark jobs, what other types of jobs can we launch via YARN?

MR and Spark jobs are what is currently supported; however, this may change in the future as the need arises. YARN is application-agnostic and is intentionally designed to allow developers to create applications to run on its distributed framework.

 

Additional details regarding YARN applications are available here, from this link.

 

 

> Are jobs moved from /tmp/logs/<user-id>/logs folder to /user/history/done & /user/history/done_intermediate ones?

> Are they created simultaneously?

To best clarify the answer, listed below is a brief overview of the order of operations of a MR job in YARN:

1) MR job submitted to RM from client

2) Application folder is created in /tmp/logs/<user-id>/logs/application_xxxxxxxxxxxx_xxxx

3) MR job runs in YARN on the cluster

4) MR job completes, counters from job are reported on job client that submitted job

5) Counter information (.jhist file) and job_conf.xml files are written to /user/history/done_intermediate/<user>/job_xxxxxxxxxx_xxxx*

6) .jhist file and job_conf.xml are then moved from /user/history/done_intermediate/<user>/ to /user/history/done

7) Container logs from each Node Manager are aggregated into /tmp/logs/<user-id>/logs/application_xxxxxxxxxxxx_xxxx
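
To make the steps above easier to check on a cluster, here is a small sketch that derives the HDFS locations mentioned in the list from a YARN application ID. It relies on the MR job ID using the same two numeric parts as the application ID (`application_<ts>_<seq>` becomes `job_<ts>_<seq>`). The roots `/tmp/logs` and `/user/history` are the ones from this thread and are assumed to match your configuration. The function and class names are hypothetical, and only the root of the done folder is returned because the layout beneath it is not described here.

```python
import re
from dataclasses import dataclass

# YARN application IDs look like application_<cluster timestamp>_<sequence>.
APP_ID_RE = re.compile(r"^application_(\d+)_(\d+)$")


@dataclass
class JobPaths:
    job_id: str                # MR job ID derived from the application ID
    aggregated_logs: str       # step 7: aggregated container logs
    intermediate_history: str  # step 5: .jhist and job_conf.xml (glob pattern)
    done_root: str             # step 6: root the history files are moved under


def expected_paths(app_id: str, user: str,
                   log_root: str = "/tmp/logs",
                   history_root: str = "/user/history") -> JobPaths:
    """Build the HDFS paths from the list above for one MR application."""
    match = APP_ID_RE.match(app_id)
    if match is None:
        raise ValueError(f"not a YARN application ID: {app_id!r}")
    cluster_ts, seq = match.groups()
    job_id = f"job_{cluster_ts}_{seq}"
    return JobPaths(
        job_id=job_id,
        aggregated_logs=f"{log_root}/{user}/logs/{app_id}",
        intermediate_history=f"{history_root}/done_intermediate/{user}/{job_id}*",
        done_root=f"{history_root}/done",
    )


if __name__ == "__main__":
    # Example values only; substitute a real application ID and user.
    print(expected_paths("application_1400000000000_0001", "ts"))
```

The resulting paths can be passed to `hdfs dfs -ls` to see which of the steps has actually happened for a given job.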

 

 

Hope this helps!

 

 


8 REPLIES

Expert Contributor

Hi TS,

 

Thanks for your post. With regard to what you have reported, is the issue that you're seeing specific to Spark jobs submitted to YARN? If that's the case, it's important to note that the Job History Server is specific to MapReduce jobs run on YARN and not actually for Spark. The history of Spark jobs submitted to YARN is handled by a completely separate service called the Spark History Server.

 

Are you able to run a simple Pi MapReduce job submitted to YARN, and does that appear in the JHS Web UI once completed?
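
One way to run that check without the browser is to ask the Job History Server for its job list over REST. The sketch below is only an outline under some assumptions: that the JHS REST endpoint `/ws/v1/history/mapreduce/jobs` is served on the same port as the web UI (19888 here), that the cluster is unsecured (no Kerberos/SPNEGO), and that the response has the usual `{"jobs": {"job": [...]}}` shape. The function names are made up for this example.

```python
import json
from urllib.request import urlopen


def jobs_url(host: str, port: int = 19888) -> str:
    """URL of the Job History Server job list (port is an assumed default)."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def parse_jobs(payload: str):
    """Extract (id, name, state) tuples from a JHS jobs response body."""
    data = json.loads(payload)
    # When the server knows no jobs, "jobs" may be null or missing entirely.
    jobs = (data.get("jobs") or {}).get("job") or []
    return [(j.get("id"), j.get("name"), j.get("state")) for j in jobs]


def list_finished_jobs(host: str, port: int = 19888, timeout: float = 10.0):
    """Fetch and parse the job list; needs network access to the JHS."""
    with urlopen(jobs_url(host, port), timeout=timeout) as resp:
        return parse_jobs(resp.read().decode("utf-8"))
```

An empty list after a completed Pi job would point at the history files not reaching the /user/history folders, rather than at the web UI itself.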

 

Contributor

Anthony, thank you for the clarification!

Having said that (the JHS is specific to MapReduce jobs run on YARN), where will other types of jobs be shown?

(You said that Spark has its own history server...)

BTW: Besides M/R and Spark jobs, what other types of jobs can we launch via YARN?

How are jobs moved from the /tmp/logs/<user-id>/logs folder to the /user/history/done and /user/history/done_intermediate folders?

Or are they created simultaneously?

 

Thank you for your assistance, it is much appreciated!

 

Contributor
Thank you for the clarification!

Expert Contributor
What if this step fails because of a permission issue?
"Counter information (.jhist file) and job_conf.xml files are written to /user/history/done_intermediate/<user>/job_xxxxxxxxxx_xxxx*"

Can the .jhist and .xml files be created again?

Expert Contributor

Hi MSharma,

 

Would you be able to provide additional context regarding the failure / permission issue that you're experiencing?

 

If there's a specific error message or symptom that is occurring could you provide more details as to what is happening?

 

 

 

Expert Contributor

This is resolved now. There was a permission issue; after setting the right permissions, the issue was resolved.

Expert Contributor

Thanks for the update MSharma, glad to hear that you were able to resolve the issue!