New Contributor
Posts: 5
Registered: ‎04-27-2016

Problem running spark with oozie

I am running a Spark Streaming application inside Oozie. It runs fine in the dev environment, where the Spark version is 1.3.0, but in another region it fails with the error below.

2016-04-27 05:39:58,142 ERROR [main] Error starting MRAppMaster
java.lang.IllegalArgumentException: Invalid ContainerId: container_e31_1461603486630_0010_01_000001

This region runs Spark 1.5 on CDH 5.5.0. I understand the failure has something to do with the Spark version, but the spark-core jar in the Oozie share lib is also version 1.5. Below are my workflow.xml and properties file. I also tried adding the classpath directly in workflow.xml, but that doesn't help either.

<workflow-app xmlns="uri:oozie:workflow:0.1" name="sample" >
<start to="spark-stream" />
<action name="spark-stream">
<spark xmlns="uri:oozie:spark-action:0.1">
<spark-opts>--executor-cores 100 --executor-memory 12G --driver-memory 4G --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/client/* spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/client/*</spark-opts>
<ok to="end"/>
<error to="mail_users"/>
<action name="mail_users">
<email xmlns="uri:oozie:email-action:0.1">
<subject>Job failed</subject>
<body>Please check</body>
<ok to="end"/>
<error to="end"/>
<end name="end" />

Properties file:



The job jar in the lib folder is built against Spark 1.5.0 from CDH 5.5.0. Could someone suggest a workaround or a solution for this?

Cloudera Employee
Posts: 3
Registered: ‎09-29-2015

Re: Problem running spark with oozie

Does the code work when submitted directly through spark-submit?


Most likely, one of your dependencies is pulling in an older version of the Hadoop/YARN libraries. Look for Hadoop or YARN jar files in your package. Also, the "" file should contain the version information. CDH 5.5 is based on Hadoop/YARN 2.6.0, and ideally you'll be using the dependency package that Cloudera provides. The version in that case should be "2.6.0-cdh5.5". More information on the dependency jars that Cloudera provides can be found here:
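
To make the "look for Hadoop or YARN jar files in your package" check concrete, here is a minimal Python sketch that inspects a job jar. It is only an illustration under a few assumptions: the function names `nested_hadoop_jars` and `bundled_hadoop_versions` are made up for this example, the file-name pattern is a guess at what counts as a Hadoop/YARN jar, and the version lookup assumes the jar was built with Maven and still carries `META-INF/maven/org.apache.hadoop/<artifact>/pom.properties` entries, which a shaded or repackaged jar may have dropped.

```python
import re
import sys
import zipfile

# Assumption: nested jars named hadoop-*.jar or containing "yarn" are suspects.
NESTED_JAR = re.compile(r"^(hadoop-|.*yarn).*\.jar$")
# Assumption: Maven metadata for bundled Hadoop artifacts lives under this prefix.
MAVEN_PREFIX = "META-INF/maven/org.apache.hadoop/"


def nested_hadoop_jars(jar_path):
    """Return archive entries that look like bundled Hadoop/YARN jar files."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name for name in jar.namelist()
            if NESTED_JAR.match(name.rsplit("/", 1)[-1])
        )


def bundled_hadoop_versions(jar_path):
    """Map each bundled Hadoop artifact to the version in its pom.properties."""
    versions = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.startswith(MAVEN_PREFIX) and name.endswith("/pom.properties"):
                artifact = name.split("/")[3]
                for line in jar.read(name).decode("utf-8").splitlines():
                    if line.startswith("version="):
                        versions[artifact] = line.split("=", 1)[1].strip()
    return versions


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_jar.py path/to/job.jar
    print("Nested Hadoop/YARN jars:", nested_hadoop_jars(sys.argv[1]))
    print("Bundled Hadoop versions:", bundled_hadoop_versions(sys.argv[1]))
```

If either list comes back with a version older than the cluster's Hadoop, that jar is a likely culprit. An empty result does not prove the jar is clean, since relocated or unpacked classes would not be detected by this sketch.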