Spark job stuck in running state when run via Hue/Oozie


I have a Spark job that converts CSV to Parquet, and I am trying to run it through an Oozie workflow in Hue. It is the simplest possible workflow: a single step (a Spark program).

When I run it with a small program JAR (say, a hello-Spark type of example), it works fine when submitted via Hue.

But with a bigger JAR (~96 MB), the job gets stuck in the running state. The code itself is not the issue: the same JAR works perfectly with spark-submit on the very same environment under the same conditions. I am running both in client mode to keep debugging simple.

Normally the logs show whether there are exceptions or whether the job is hung due to memory issues (the continuous 'heart beat' info logs), but in this case I cannot even view the logs. When I manually kill the job, the logs are still not accessible; it says:

Could not find job job_1481270830724_0007.

{"RemoteException":{"exception":"NotFoundException","message":"java.lang.Exception: job, job_1481270830724_0007, is not found","javaClassName":"org.apache.hadoop.yarn.webapp.NotFoundException"}} (error 404)
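
For anyone trying to reproduce this: the ID in the error is a MapReduce-style job ID, and YARN tracks the same run under an application ID with the same timestamp and sequence number. Below is a minimal Python sketch that builds the matching `yarn logs` command. The helper names are hypothetical, and it assumes the ID belongs to the Oozie launcher job and that log aggregation is enabled on the cluster, so it may still return nothing here.

```python
import re
from typing import List


def job_to_application_id(job_id: str) -> str:
    """Map a job ID (job_<timestamp>_<seq>) to application_<timestamp>_<seq>."""
    match = re.fullmatch(r"job_(\d+)_(\d+)", job_id)
    if match is None:
        raise ValueError(f"not a job ID: {job_id!r}")
    cluster_ts, seq = match.groups()
    return f"application_{cluster_ts}_{seq}"


def yarn_logs_command(job_id: str) -> List[str]:
    """Build the `yarn logs` invocation for the application behind a job ID."""
    # Assumption: aggregated logs exist for this application.
    return ["yarn", "logs", "-applicationId", job_to_application_id(job_id)]


# Job ID taken from the error message above.
print(" ".join(yarn_logs_command("job_1481270830724_0007")))
```

Running the printed command on a cluster node is one way to check whether YARN kept any logs for the run, even when the Hue job browser reports the job as not found.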

Can you help me figure out what the issue could be? Does it have something to do with the memory allocated to Hue?

Thanks in advance