Member since 11-12-2014
12-30-2014 04:37 PM
Thanks Harsh! Really looking forward to this being released! Just to clarify: will it be included in 5.3.x, or only in 5.4?
12-01-2014 12:18 PM
OK, so I've managed to downgrade the cluster to 5.1 and start my coordinator jobs. It seems to be working as expected now. So it seems to me there is some change that breaks backward compatibility, but I don't know what it is. At least I know that we shouldn't upgrade to 5.2 until we figure out how to solve this.
11-30-2014 12:10 PM
Yes, I'm using the Oozie streaming action, and the streaming jar is the one bundled with CDH. What is strange is that we use the same workflows and coordinators in our production cluster, which is on CDH 5.1. Here is an example workflow action we use:

<action name="raw-pass" retry-max="3" retry-interval="1">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${rawPassOutput}"/>
</prepare>
<streaming>
<mapper>mapreducers/bin/mapred run MyJobMapper</mapper>
<reducer>mapreducers/bin/mapred run MyJobReducer</reducer>
</streaming>
<configuration>
<property>
<!--
This will add avro.jar and avro-mapred.jar dependencies to the job
(@see mapred.input.format.class property below)
-->
<name>oozie.action.sharelib.for.map-reduce</name>
<value>mapreduce-streaming,hcatalog,sqoop</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>${rawPassInput}</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>${rawPassOutput}</value>
</property>
<property>
<!--
This input format will automagically decode avro files so
that our mappers will get plain json as input.
-->
<name>mapred.input.format.class</name>
<value>org.apache.avro.mapred.AvroAsTextInputFormat</value>
</property>
</configuration>
<archive>${mapreducersArchive}#mapreducers</archive>
</map-reduce>
<ok to="aggregate-pass"/>
<error to="failure-email-notification"/>
</action>

I cannot share the actual mapper and reducer scripts, though. Also: is there a relatively easy way to downgrade a CDH installation from CDH 5.2 to 5.1? Or at least to install 5.1 from scratch? I'm not sure there was an option in Cloudera Manager to install 5.1, only 5.2... Thanks!
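For context on how the pieces above fit together: with org.apache.avro.mapred.AvroAsTextInputFormat, each line the streaming mapper reads from stdin is the JSON rendering of one Avro record. The actual MyJobMapper script is not shared in the thread, so the following is only a minimal hypothetical sketch of such a mapper (the "user_id" field name is an assumption, not from the real data):

```python
#!/usr/bin/env python
"""Hypothetical Hadoop Streaming mapper sketch.

With AvroAsTextInputFormat, each stdin line is the JSON form of one
Avro record. The "user_id" field below is a made-up example field.
"""
import json
import sys


def main(stdin=sys.stdin, stdout=sys.stdout):
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Hadoop Streaming expects tab-separated key/value output.
        key = record.get("user_id", "unknown")  # assumed field name
        stdout.write("%s\t1\n" % key)


if __name__ == "__main__":
    main()
```

The point of this setup is that the mapper never touches Avro decoding itself; the input format does it, which is why the sharelib property in the workflow has to pull in the avro and avro-mapred jars.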
11-12-2014 10:18 PM
Hi,

We have had CDH 5.1 running in production for a few months already with no issues. Recently we created another cluster for a QA environment and installed CDH 5.2 through Cloudera Manager. When we tried to run some Oozie workflows (the same jobs as in production) we got the following error:

Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text

I figured out this error occurred because Hadoop tried to use the IdentityMapper class instead of our streaming processors. I've tried a lot of different options, but nothing has helped so far. The closest I could get was to compare the actual jobConf files we get in production (CDH 5.1) and on the new cluster (CDH 5.2). I found that on the new cluster the jobConf doesn't contain the following properties:

stream.map.streamprocessor
stream.reduce.streamprocessor

which, if I understand correctly, are used by hadoop-streaming.jar. I have no idea where to look next. I would really appreciate any help with this issue.

Thanks, Anatoly
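The jobConf comparison described above can be automated. Here is a rough sketch (the file paths are placeholders for jobConf XML dumps saved from each cluster, not real files from the thread) that parses two Hadoop job configuration files and prints any properties present on one cluster but missing on the other, such as stream.map.streamprocessor:

```python
# Sketch: diff two Hadoop jobConf XML dumps to spot properties that
# exist on one cluster but not the other. Paths are placeholders.
import xml.etree.ElementTree as ET


def load_props(path):
    """Parse a Hadoop configuration XML file into a {name: value} dict."""
    tree = ET.parse(path)
    return {
        p.findtext("name"): p.findtext("value")
        for p in tree.getroot().iter("property")
    }


def diff_props(old, new):
    """Return property names present in `old` but missing from `new`."""
    return sorted(set(old) - set(new))


if __name__ == "__main__":
    cdh51 = load_props("jobconf-cdh51.xml")  # placeholder path
    cdh52 = load_props("jobconf-cdh52.xml")  # placeholder path
    for name in diff_props(cdh51, cdh52):
        print("missing on CDH 5.2: %s = %s" % (name, cdh51[name]))
```

This only narrows down *which* properties differ; why the streaming properties are dropped on the newer release still has to be chased separately.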
Labels:
- Apache Oozie