
Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)



New Contributor

I am trying to run a simple Spark job using the Oozie workflow scheduler, and I am getting this error:

System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.

I have assigned 5 GB to my HDP sandbox on VirtualBox.

I created the Spark jar on my local machine and uploaded it to the HDP sandbox. My workflow.xml looks like this:

<workflow-app name="samplespark-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="sparkjob"/>
    <action name="sparkjob">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>local[1]</master>
            <name>Spark Test</name>
            <class>main.scala.RDDscala.RDD1</class>
            <jar>${nameNode}/spark_oozie_action/sparkrdd_2.11-0.0.1.jar</jar>
            <spark-opts>--driver-memory 5g --num-executors 1</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <kill name="fail-output">
        <message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

The program runs fine on my local system from the command prompt with the following command:

spark-submit --class main.scala.RDDscala.RDD2 --master local target\scala-2.11\sparkrdd_2.11-0.0.1.jar

Any help would be appreciated.

1 ACCEPTED SOLUTION


Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

Contributor

It seems your Spark driver is running with a very small heap: 202375168 bytes is about 193 MB, well below the required minimum of 4.718592E8 bytes (roughly 450 MB). Please try increasing the driver memory and see if it helps. Use this parameter (for example) when submitting the job:

--driver-memory 1g

8 Replies

Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

New Contributor

Hi Peter,

I have already set driver-memory to 5g in spark-opts in workflow.xml, and I am still getting the same error.

Does this have something to do with the memory assigned to HDP 2.5 on VirtualBox? In my case it is 5 GB.

Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

Contributor

Ah, sorry :) Yes, here you can't set driver parameters via <spark-opts>--driver-memory 10g</spark-opts>, because your driver (the Oozie launcher job) has already been launched before that point. The Oozie launcher (which is a MapReduce job) is what launches your actual Spark job, so spark-opts is not relevant for the driver. The Oozie Spark action docs say:

"The configuration element, if present, contains configuration properties that are passed to the Spark job."

This shouldn't be Spark configuration, though; it should be MapReduce configuration for the launcher job.

So, please try adding the following:

<configuration>
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.map.java.opts</name>
        <value>-Xmx3072m</value>
    </property>
</configuration>
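For reference, a sketch of where that configuration element would sit in the workflow above (in the spark-action 0.1 schema, configuration goes after name-node and before master; the values are examples, not tuned numbers):

```xml
<action name="sparkjob">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- launcher (MapReduce) memory settings, not Spark settings -->
        <configuration>
            <property>
                <name>oozie.launcher.mapreduce.map.memory.mb</name>
                <value>4096</value>
            </property>
            <property>
                <name>oozie.launcher.mapreduce.map.java.opts</name>
                <value>-Xmx3072m</value>
            </property>
        </configuration>
        <master>local[1]</master>
        <name>Spark Test</name>
        <class>main.scala.RDDscala.RDD1</class>
        <jar>${nameNode}/spark_oozie_action/sparkrdd_2.11-0.0.1.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```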


Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

New Contributor

Hi Peter,

Thank you so much for such a clear answer. I tried the steps you mentioned and set the two values to 4096 and 3072, but my job failed with "MAP capability required is more than the supported max container capability in the cluster". I checked the properties mapreduce.map.memory.mb and mapreduce.map.java.opts in mapred-site.xml, and their values are 250 and -Xmx200m. This might be why my job is getting killed: it is requesting a container size larger than those defaults.

Is there any workaround for this? If I update the values in mapred-site.xml, which services do I need to restart for the changes to take effect? Or can it be resolved some other way? By the way, I am running HDP 2.5.

Thanks

Rahul

Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

Contributor

The two values were just examples; change them to something smaller that fits your system environment. Either go for less than 512 MB (which might do the job) or increase the RAM available to the containers:

  1. Increase the VirtualBox memory from (I guess) 4096 to, e.g., 8192.
  2. Log into Ambari at http://my.local.host:8080.
  3. Change yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb from their defaults to 4096.
  4. Save and restart the affected services (at least YARN, Oozie, and Spark).
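If you are curious what Ambari changes under the hood, the equivalent yarn-site.xml entries would look roughly like this (values taken from step 3; on an Ambari-managed cluster you should make the change through Ambari so it isn't overwritten):

```xml
<!-- yarn-site.xml: example container memory limits -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
</property>
```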

Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

New Contributor

Hi Peter,

Just a small question. My Spark Oozie workflow keeps running for a long time. When I checked the Oozie logs, I found it is trying to connect to port 8032 on sandbox.hortonworks.com. I do not know why it is going to 8032 instead of 8050, although I have specified 8050 in my job.properties.

Any idea?
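For context, the ResourceManager address the launcher uses normally comes from the ${jobTracker} property in job.properties (8032 is stock Hadoop's default RM port, while HDP uses 8050, so a fallback to 8032 often means the property isn't reaching the action). A minimal job.properties sketch, with host and paths assumed from this thread:

```properties
# job.properties (sketch; host and application path are assumptions)
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
oozie.wf.application.path=${nameNode}/spark_oozie_action
```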

Thanks

Rahul

Re: Error in running Spark Job using oozie workflow in local mode.(System memory 202375168 must be at least 4.718592E8. Please use a larger heap size.)

Expert Contributor

@rahul gulati Earlier I observed a similar exception at the time of launching the Oozie workflow. Can you try setting the following memory-related parameter in the Oozie workflow.xml to some higher value, like 1024 MB, so that the workflow launches successfully?

For e.g:

<property>
    <name>oozie.launcher.mapred.map.child.java.opts</name>
    <value>-Xmx1024m</value>
</property>

See if this helps you.
