
Spark2-Submit through Oozie shell action hangs and restarts spark application

Contributor

I have an Oozie job that runs an Oozie shell action; the shell action starts a Spark application via spark2-submit. I am mostly running Spark SQL. The job runs for a while, suddenly hangs, and then starts the Spark application all over again.

 

I ran the same Spark application in CDSW and it ran fine, without issues.

 

The same thing is happening with another Oozie job. The only thing these two jobs have in common is that they run longer, around 2 hours.

 

Any help would be appreciated.
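For context, the shell action looks roughly like this (a minimal sketch; the action, node, and script names below are placeholders rather than my actual values), and the script it runs is what calls spark2-submit:

<action name="run-spark-sql">
  <shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- run_spark_sql.sh calls spark2-submit with the Spark SQL job -->
    <exec>run_spark_sql.sh</exec>
    <file>scripts/run_spark_sql.sh#run_spark_sql.sh</file>
  </shell>
  <ok to="end"/>
  <error to="kill"/>
</action>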

1 ACCEPTED SOLUTION

Contributor

The Oozie launcher mapper was running out of its 4 GB of memory. I changed that to 8 GB, and now the job runs fine without restarts.

 

<configuration>
  <property>
    <name>oozie.launcher.mapreduce.map.memory.mb</name>
    <value>8000</value>
  </property>
  <property>
    <name>oozie.launcher.mapreduce.map.java.opts</name>
    <value>-Xmx1500m</value>
  </property>
  <property>
    <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>oozie.launcher.mapreduce.map.java.opts</name>
    <value>-Xmx870m</value>
  </property>
</configuration>
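In case it helps anyone else: these oozie.launcher.* properties are set in the shell action's own <configuration> element in workflow.xml, and they control the size of the launcher container that the shell action (and anything it spawns, such as spark2-submit) runs in. Roughly like this (a sketch; the script name is a placeholder):

<shell xmlns="uri:oozie:shell-action:0.2">
  <job-tracker>${jobTracker}</job-tracker>
  <name-node>${nameNode}</name-node>
  <configuration>
    <!-- memory of the Oozie launcher container running the shell action -->
    <property>
      <name>oozie.launcher.mapreduce.map.memory.mb</name>
      <value>8000</value>
    </property>
  </configuration>
  <exec>run_spark_sql.sh</exec>
  <file>scripts/run_spark_sql.sh#run_spark_sql.sh</file>
</shell>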

 


14 Replies

Expert Contributor

Do you see any error or info related to a timeout, or anything else indicating why the application failed and had to be restarted? You can gather the YARN logs for the application and check the Application Master section to see the reason for the failure. Maybe the Spark configuration used in CDSW is different from the one used by spark2-submit.
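For example, something like the following once the application has finished (a sketch; it assumes log aggregation is enabled on the cluster, and the application ID and user are placeholders):

# Pull the aggregated container logs for a finished application.
# Requires yarn.log-aggregation-enable=true; otherwise the output is empty.
# -appOwner is only needed when the job was submitted by a different user
# (for example the oozie service account).
yarn logs -applicationId <application_id> -appOwner <submitting_user> > app_logs.txt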

Contributor

Thanks for the reply. I tried to view the logs from the Spark UI, but they were blank. It would be great if you could guide me on checking the YARN logs.

I tried the command below and it also came back blank:

yarn logs -applicationId application_1543621459332_8094 > spark_app.log

 

 

Contributor

Most of the logs in the Spark UI show "No logs available for container".

Contributor

Hi,

We see the following error in the container log:

"Container [pid=68208,containerID=container_e59_1543621459332_7731_01_000002] is running beyond physical memory limits. Current usage: 4.0 GB of 4 GB physical memory used; 26.5 GB of 8.4 GB virtual memory used. Killing container..."

 

Whether I run it in CDSW or through Oozie, my Spark application has the same memory settings and configuration (executor memory, cores, driver memory, memory overhead, etc.). From CDSW it never fails, but when I run it from the Oozie shell action (calling spark2-submit) it randomly fails.

I am trying to understand what is different in Oozie, and how I can set this memory limit.

Expert Contributor

Hi Sunil,

 

This error indicates that the container size in YARN is set to 4 GB and your Spark application needs more memory to run.

 

Container Memory: yarn.nodemanager.resource.memory-mb

 

As a test, you can increase the container size in the YARN configuration to, say, 6 GB or 8 GB and see if the application succeeds. (If using Cloudera Manager, you will see this under CM > YARN > Configuration > Container Memory, i.e. yarn.nodemanager.resource.memory-mb.)
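(On a cluster not managed by Cloudera Manager, the equivalent change would be made in yarn-site.xml on each NodeManager, roughly as below; 8192 is only an example value.)

<!-- yarn-site.xml: total memory YARN may allocate to containers on this node -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>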

 

Regards
Bimal

Contributor

I applied this property and increased the limit to 6 GB. It still fails with the exact same error message.

Expert Contributor

Hi Sunil,

 

That means spark2-submit is asking for a container size of 4 GB. The --executor-memory must be getting set to 4g.

 

Can you check the Spark command being used and set --executor-memory and --driver-memory to 6g?
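For example, something along these lines inside the script that the shell action launches (a sketch; the class, jar, and other values are placeholders):

# Script called by the Oozie shell action (all names below are placeholders).
spark2-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 6g \
  --executor-memory 6g \
  --class com.example.MySparkSqlJob \
  my-spark-sql-job.jar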

 

 

Regards
Bimal

Contributor

Sorry, I forgot to mention: I have been using 14 GB of executor memory and 10 GB of driver memory. None of my tasks spill memory to disk. This is so strange, and it is shaking my fundamental understanding of Spark.

I have a memory overhead of 3 GB.

 

Again, the same settings are used in CDSW, but it never failed from there. It only fails when I run the job through Oozie. It then restarts on its own, and that run completes without any failures.

 

When would Spark use physical memory versus virtual memory?
