New Contributor
Posts: 5
Registered: ‎04-27-2016

Problem running spark with oozie

I am running a Spark Streaming application inside Oozie. It runs fine in the dev environment, where the Spark version is 1.3.0, but in another environment it fails with the error below.

2016-04-27 05:39:58,142 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.IllegalArgumentException: Invalid ContainerId: container_e31_1461603486630_0010_01_000001


This environment runs Spark 1.5 on CDH 5.5.0. I understand the failure has something to do with the Spark version, but the spark-core jar in the Oozie share lib is also version 1.5. Below are my workflow.xml and properties file. I also tried adding the classpath directly in workflow.xml, but that doesn't help either.

<workflow-app xmlns="uri:oozie:workflow:0.1" name="sample" >
    <start to="spark-stream" />
    <action name="spark-stream">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${job_tracker}</job-tracker>
            <name-node>${name_node}</name-node>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>jobname</name>
            <class>classname</class>
            <jar>/path/jar</jar>
            <spark-opts>--executor-cores 100 --executor-memory 12G --driver-memory 4G --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/client/* --conf spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/client/*</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="mail_users"/>
    </action>
    <action name="mail_users">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>sample@sample.com</to>
            <subject>Job failed</subject>
            <body>Please check</body>
        </email>
        <ok to="end"/>
        <error to="end"/>
    </action>
    <end name="end" />
</workflow-app>


Properties file:

job_tracker=server:8032
name_node=hdfs://server:8020
examplesRoot=rootdir
oozie.use.system.libpath=true
oozie.wf.application.path=${name_node}/test/${examplesRoot}

 

The job jar, which is in the lib folder, is built against Spark 1.5.0 from CDH 5.5.0. Could someone suggest a workaround or a solution for this?

 
Cloudera Employee
Posts: 3
Registered: ‎09-29-2015

Re: Problem running spark with oozie

Does the code work when submitted through spark-submit?

 

Most likely, one of your dependencies is pulling in an older version of the Hadoop/YARN libraries. Look for hadoop or yarn jar files in your package. Also, the "yarn-version-info.properties" file should contain the version information. CDH 5.5 is based on Hadoop/YARN 2.6.0, and ideally you'll be using the Cloudera-provided dependency package. The version in that case should be "2.6.0-cdh5.5". More information on the dependency jars that Cloudera provides can be found here:

http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_vd_cdh5_maven_repo.html#concept_x...
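
To make the check above concrete, here is a minimal Python sketch that looks inside a job jar for bundled Hadoop/YARN classes, nested hadoop-*/yarn-* jars, and version-info files. For background: as far as I know, the "e31" part of the container ID is an epoch field introduced with Hadoop 2.6, and older YARN client classes fail to parse it, which would explain the "Invalid ContainerId" error if a stale jar ends up on the classpath. The function names (scan_jar, read_version) and the matching patterns are my own assumptions for illustration, not part of any Cloudera or Hadoop tool.

```python
import io
import re
import zipfile

# Hypothetical heuristics: entries that suggest Hadoop/YARN was bundled into the jar.
HADOOP_JAR = re.compile(r"(^|/)(hadoop|yarn)-[^/]*\.jar$")
HADOOP_CLASSES = ("org/apache/hadoop/yarn/", "org/apache/hadoop/mapreduce/")
VERSION_FILES = ("yarn-version-info.properties", "common-version-info.properties")


def read_version(raw):
    """Return the value of the `version` key from a *-version-info.properties file."""
    for line in raw.decode("utf-8", "replace").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "version":
            return value.strip()
    return None


def scan_jar(jar, label="job jar"):
    """Scan a jar (path or file object) and any nested hadoop/yarn jars.

    Returns a list of (location, finding) tuples; an empty list means
    nothing suspicious was found by these heuristics.
    """
    findings = []
    with zipfile.ZipFile(jar) as zf:
        names = zf.namelist()
        # Classes copied directly into the jar (for example by a shaded/fat build).
        if any(n.startswith(HADOOP_CLASSES) for n in names):
            findings.append((label, "bundles Hadoop/YARN classes"))
        for name in names:
            base = name.rsplit("/", 1)[-1]
            if base in VERSION_FILES:
                version = read_version(zf.read(name))
                findings.append((label, f"{base}: version={version}"))
            elif HADOOP_JAR.search(name):
                findings.append((label, f"nested jar {name}"))
                # Recurse into the nested jar to read its version file.
                findings.extend(scan_jar(io.BytesIO(zf.read(name)), name))
    return findings
```

Calling scan_jar on the jar from the workflow's lib folder and printing each tuple should show whether anything other than a 2.6.0-based CDH version is packaged. If it reports an older version, marking the Hadoop/YARN dependencies as provided in the build (so they come from the cluster instead) is the usual fix.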

 
