08-19-2016 02:18 AM
I changed the entries in the files listed above from yarn-tez to yarn and bounced Oozie, Hive and YARN. No luck. Then I decided to remove YARN using the Ambari REST interface. I didn't manage to do that, but I had stopped Oozie in order to try. On restarting Oozie the problem was cleared and my Oozie workflows now execute. Strange...
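For reference, this is roughly what the stop/start looks like through the Ambari REST API. A sketch only: the admin:admin credentials, the ambari-server host and CLUSTER_NAME are placeholders for your own setup.

# Stop Oozie (set desired state to INSTALLED)
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
    -d '{"RequestInfo":{"context":"Stop Oozie"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
    http://ambari-server:8080/api/v1/clusters/CLUSTER_NAME/services/OOZIE

# Start Oozie again (set desired state to STARTED)
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
    -d '{"RequestInfo":{"context":"Start Oozie"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
    http://ambari-server:8080/api/v1/clusters/CLUSTER_NAME/services/OOZIE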
08-18-2016 04:01 AM
I think that the yarn-tez value is set by default because Tez is integrated into HDP 2.4, and the Tez install guide at https://tez.apache.org/install.html states in step 7 that this value should be set to yarn-tez. I'll try the change; you never know, there might be a conflict.
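For anyone following along, the property in question lives in mapred-site.xml and on my cluster currently reads:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn-tez</value>
</property>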
08-18-2016 02:38 AM
I have HDP 2.4 installed on CentOS 6.8 on 6 virtual KVM instances based on two physical machines. I have been having problems with Oozie jobs whose workflows call either Hive or Spark based actions. The error that I have encountered is:
2016-08-18 11:01:18,134 WARN ActionStartXCommand:523 - SERVER[hc1m1.nec.co.nz] USER[oozie] GROUP[-] TOKEN[] APP[PoleLocationsForNec] JOB[0000001-160818105046419-oozie-oozi-W] ACTION[0000001-160818105046419-oozie-oozi-W@hive-select-data] Error starting action [hive-select-data]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
I have encountered this error on jobs that previously worked, so I can't think what has changed. My workflow looks like this:
<workflow-app name='PoleLocationsForNec'
              xmlns="uri:oozie:workflow:0.5"
              xmlns:sla="uri:oozie:sla:0.2">
    <start to='hive-select-data'/>
    <action name="hive-select-data">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>select_pole_locations_for_nec.hql</script>
        </hive>
        <ok to="hdfs-move-file"/>
        <error to="fail"/>
    </action>
    <action name="hdfs-move-file">
        <fs>
            <move source='${sourceFile}' target='${targetFile}${SaveDateString}'/>
        </fs>
        <ok to="sqoop-copy"/>
        <error to="fail"/>
    </action>
    <action name="sqoop-copy">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <arg>export</arg>
            <arg>--options-file</arg>
            <arg>${sqoopOptFile}</arg>
            <file>${sqoopOptFile}</file>
        </sqoop>
        <ok to="cleanup"/>
        <error to="fail"/>
    </action>
    <action name="cleanup">
        <fs>
            <delete path='${triggerFileDir}'/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name='fail'>
        <message>An error occurred - message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end'/>
</workflow-app>
and my job configuration looks like this:
<configuration>
    <property>
        <name>hdfsUser</name>
        <value>oozie</value>
    </property>
    <property>
        <name>WaitForThisInputData</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/mule/sheets/input/PoleLocationsForNec/trigger/</value>
    </property>
    <property>
        <name>wfPath</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/user/oozie/wf/PoleLocationsForNec/</value>
    </property>
    <property>
        <name>user.name</name>
        <value>oozie</value>
    </property>
    <property>
        <name>sqoopOptFile</name>
        <value>sqoop_options_file.opt</value>
    </property>
    <property>
        <name>oozie.coord.application.path</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/user/oozie/wf/PoleLocationsForNec</value>
    </property>
    <property>
        <name>sourceFile</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/mule/sheets/input/PoleLocationsForNec/Pole_locations_for_NEC_edit.csv</value>
    </property>
    <property>
        <name>mapreduce.job.user.name</name>
        <value>oozie</value>
    </property>
    <property>
        <name>Execution</name>
        <value>FIFO</value>
    </property>
    <property>
        <name>triggerFileDir</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/mule/sheets/input/PoleLocationsForNec/trigger/</value>
    </property>
    <property>
        <name>coordFreq</name>
        <value>30</value>
    </property>
    <property>
        <name>Concurrency</name>
        <value>1</value>
    </property>
    <property>
        <name>targetFile</name>
        <value>/mule/sheets/store/Pole_locations_for_NEC_edit.csv</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>hc1r1m2.nec.co.nz:8050</value>
    </property>
    <property>
        <name>startTime</name>
        <value>2016-07-31T12:01Z</value>
    </property>
    <property>
        <name>wfProject</name>
        <value>PoleLocationsForNec</value>
    </property>
    <property>
        <name>targetDir</name>
        <value>/mule/sheets/store/</value>
    </property>
    <property>
        <name>dataFreq</name>
        <value>30</value>
    </property>
    <property>
        <name>nameNode</name>
        <value>hdfs://hc1m1.nec.co.nz:8020</value>
    </property>
    <property>
        <name>doneFlag</name>
        <value>done_flag.dat</value>
    </property>
    <property>
        <name>oozie.libpath</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/user/oozie/share/lib</value>
    </property>
    <property>
        <name>oozie.use.system.libpath</name>
        <value>true</value>
    </property>
    <property>
        <name>oozie.wf.rerun.failnodes</name>
        <value>true</value>
    </property>
    <property>
        <name>moveFile</name>
        <value>Pole_locations_for_NEC_edit.csv</value>
    </property>
    <property>
        <name>SaveDateString</name>
        <value>-20160817-230100</value>
    </property>
    <property>
        <name>triggerDir</name>
        <value>trigger/</value>
    </property>
    <property>
        <name>sourceDir</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/mule/sheets/input/PoleLocationsForNec/</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/user/oozie/wf/PoleLocationsForNec</value>
    </property>
    <property>
        <name>endTime</name>
        <value>2099-01-01T12:00Z</value>
    </property>
    <property>
        <name>TimeOutMins</name>
        <value>10</value>
    </property>
    <property>
        <name>timeZoneDef</name>
        <value>GMT+12:00</value>
    </property>
    <property>
        <name>workflowAppPath</name>
        <value>hdfs://hc1m1.nec.co.nz:8020/user/oozie/wf/PoleLocationsForNec</value>
    </property>
</configuration>
I have obviously researched this issue and understand that it relates to the definition of mapreduce.framework.name or the HDFS / Resource Manager server addresses. But given that this job has worked in the past, I thought that this error might be masking another issue.
The value of mapreduce.framework.name is defined in the following files:
/etc/hadoop/conf/mapred-site.xml => yarn-tez
/etc/hive/conf/mapred-site.xml => yarn-tez
/etc/oozie/conf/hadoop-config.xml => yarn
/etc/pig/conf/pig-env.sh => yarn
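A quick way to double-check what each component actually sees (a minimal sketch, assuming the standard HDP config paths listed above; pig-env.sh is a shell script, so I checked that one by eye):

# Print the mapreduce.framework.name entry plus the <value> line that follows it
grep -A1 'mapreduce.framework.name' \
    /etc/hadoop/conf/mapred-site.xml \
    /etc/hive/conf/mapred-site.xml \
    /etc/oozie/conf/hadoop-config.xml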
I have checked all of the logs, but all I see is the JA009 error in the Oozie logs. I am just wondering whether anyone else has encountered this error or can suggest some other area that I can examine.