CDH 5.3.3 jobhistory removed on RPC call

Explorer

Hi,

I need help understanding an issue with the job history of my MapReduce job. I submit Java MR2 code remotely over RPC from a web application running inside WebSphere. Submission works fine and returns a job_id, and the job runs fine on the backend, but once the job completes the job history URL stops working and we can't see the job history. The real problem is that we cannot read the job status over RPC: JobClient comes back with a "job not found" error.

When we submit the same job from a Java application outside WebSphere, it works as expected and the job history is kept.

I suspect some property or setting is needed to keep the job history saved on the cluster so that we can read the job status.

Any help in this regard is appreciated.

 

Thanks,
MG

1 ACCEPTED SOLUTION

Re: CDH 5.3.3 jobhistory removed on RPC call

Master Guru
Does your WebSphere app load a custom set of configs to talk to the remote cluster? If so, are the JHS configs part of that set?

All of the properties below are necessary for the MR2 job to register itself with the JHS for post-job persistence. Make their values match exactly those in /etc/hadoop/conf/mapred-site.xml on the host where the 'hadoop jar' command works:

mapreduce.jobhistory.address
mapreduce.jobhistory.webapp.address (OR) mapreduce.jobhistory.webapp.https.address
yarn.app.mapreduce.am.staging-dir
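
As a rough illustration of the list above, here is a minimal sketch of a pre-submission check that reports which of these properties a client has left unset. The class and method names (`JhsConfigCheck`, `missingKeys`) are hypothetical and not part of Hadoop; only the property names come from the answer. It uses a plain `Map` so it runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Hadoop class: checks a client-side property set
// for the JobHistoryServer-related keys named in the answer above.
final class JhsConfigCheck {
    static final String JHS_ADDRESS = "mapreduce.jobhistory.address";
    static final String JHS_WEBAPP = "mapreduce.jobhistory.webapp.address";
    static final String JHS_WEBAPP_HTTPS = "mapreduce.jobhistory.webapp.https.address";
    static final String AM_STAGING_DIR = "yarn.app.mapreduce.am.staging-dir";

    /** Returns the names of required properties that are absent or blank in props. */
    static List<String> missingKeys(Map<String, String> props) {
        List<String> missing = new ArrayList<>();
        for (String key : Arrays.asList(JHS_ADDRESS, AM_STAGING_DIR)) {
            if (isBlank(props.get(key))) {
                missing.add(key);
            }
        }
        // Either the HTTP or the HTTPS web app address satisfies the requirement.
        if (isBlank(props.get(JHS_WEBAPP)) && isBlank(props.get(JHS_WEBAPP_HTTPS))) {
            missing.add(JHS_WEBAPP);
        }
        return missing;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }
}
```

In a real client, the checked values would presumably then be copied into the Hadoop `Configuration` with `conf.set(name, value)` before the job is created. The actual values should be taken from the cluster's mapred-site.xml rather than guessed.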

Re: CDH 5.3.3 jobhistory removed on RPC call

Explorer
Harsh - Thanks for the property names. This solved my problem.
 
I was setting mapreduce.jobhistory.address and mapreduce.jobhistory.webapp.address but not yarn.app.mapreduce.am.staging-dir. Once I set this third property, the problem went away.
 
What I learned in the process is that when we make the RPC call from inside IBM WebSphere, we need to set all the required properties explicitly, whereas when I make the call from a standalone program using the same JDK, it picks up some properties implicitly.

 

 

Thanks,

MG

Re: CDH 5.3.3 jobhistory removed on RPC call

Master Guru
Glad to hear you were able to resolve it. Please consider marking the topic
solved so others with similar issues may find it quicker!

If the directory of config XML files is on the classpath of the program,
then the Configuration instance will automatically discover them
(ClassLoader.getResource(…) style).
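
To see whether that classpath discovery would work in a given environment (for example inside the application server), a small probe along these lines may help. It is only a sketch: `ClasspathConfigProbe` and `locate` are hypothetical names, and the file names to probe (typically core-site.xml, mapred-site.xml and yarn-site.xml) are an assumption about which site files the client needs.

```java
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic, not a Hadoop class: reports where (if anywhere)
// the current class loader can see the given resource names.
final class ClasspathConfigProbe {
    /** Maps each resource name to the URL it resolves to, or null if not visible. */
    static Map<String, URL> locate(List<String> names) {
        // Application servers usually set a context class loader per web app.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        Map<String, URL> found = new LinkedHashMap<>();
        for (String name : names) {
            found.put(name, loader.getResource(name));
        }
        return found;
    }
}
```

Calling `ClasspathConfigProbe.locate(Arrays.asList("core-site.xml", "mapred-site.xml", "yarn-site.xml"))` from inside the web application and logging the result shows which files resolve. A null entry suggests that file is not on the classpath there, in which case its properties have to be supplied another way.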

Re: CDH 5.3.3 jobhistory removed on RPC call

Explorer

Once again thanks.

Regarding loading the full XML resources: I understand that is always preferable, but it is not permitted in our environment, so we ended up adding the properties one by one.

I am trying to convince the ops team to adopt the conf folder approach.

Thanks,
MG
