I changed the entries in the files listed above from yarn-tez to yarn and bounced Oozie, Hive and YARN. No luck. Then I decided to remove YARN using the Ambari REST interface. I didn't manage to do that, but I had stopped Oozie in order to try. On restarting Oozie the problem cleared, and my Oozie workflows now execute. Strange.
I think that the yarn-tez value is set by default because Tez is integrated into HDP 2.4, and step 7 of the Tez install guide states that this value should be set to yarn-tez. I'll try the change; you never know, there might be a conflict.
I have HDP 2.4 installed on CentOS 6.8 on six KVM virtual machines hosted on two
physical machines. I have been having problems with Oozie jobs whose workflows
call either Hive- or Spark-based actions. The error that I have encountered is:
2016-08-18 11:01:18,134 WARN ActionStartXCommand:523 - SERVER[] USER[oozie] GROUP[-] TOKEN[] APP[PoleLocationsForNec] JOB[0000001-160818105046419-oozie-oozi-W] ACTION[0000001-160818105046419-oozie-oozi-W@hive-select-data] Error starting action [hive-select-data]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
I have encountered this error on jobs that previously worked, so I can't think what has
changed. My workflow looks like this:
<workflow-app name='PoleLocationsForNec'>
    <start to='hive-select-data'/>
    <action name="hive-select-data">
        <hive xmlns="uri:oozie:hive-action:0.2">
        </hive>
        <ok to="hdfs-move-file"/>
        <error to="fail"/>
    </action>
    <action name="hdfs-move-file">
        <fs>
            <move source='${sourceFile}' target='${targetFile}${SaveDateString}'/>
        </fs>
        <ok to="sqoop-copy"/>
        <error to="fail"/>
    </action>
    <action name="sqoop-copy">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        </sqoop>
        <ok to="cleanup"/>
        <error to="fail"/>
    </action>
    <action name="cleanup">
        <fs>
            <delete path='${triggerFileDir}'/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name='fail'>
        <message>An error occurred - message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end'/>
</workflow-app>
and my job configuration looks like this:
<configuration>
</configuration>
I have researched this issue and understand that it relates to the definition
of mapreduce.framework.name or to the HDFS / resource manager server addresses. But given that
this job has worked in the past, I thought that this error might be masking another issue.
The value of mapreduce.framework.name is defined in the following files:
/etc/hadoop/conf/mapred-site.xml => yarn-tez
/etc/hive/conf/mapred-site.xml => yarn-tez
/etc/oozie/conf/hadoop-config.xml => yarn
/etc/pig/conf/ => yarn
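
As an illustration of what such an entry looks like, here is a minimal sketch of a single property in a Hadoop-style configuration file. The property name mapreduce.framework.name is an assumption taken from the standard Hadoop wording of the "Cannot initialize Cluster" message, and the value shown is the one reported above for /etc/oozie/conf/hadoop-config.xml; the real files contain many other properties that are not shown here.

```xml
<configuration>
  <!-- Assumed property name; the value is either yarn or yarn-tez in the files listed above -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

Comparing this one entry across the listed files is a quick way to spot a mismatch such as yarn versus yarn-tez. On an Ambari-managed cluster, edits made directly to these files may be overwritten when services are restarted, so any change is probably best made through Ambari.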
I have checked all of the logs, but all I see is the JA009 error in the Oozie logs. I am just wondering
whether anyone else has encountered this error or can suggest some other area that I can examine.
