Created 12-09-2016 05:59 AM
I have a Spark job that converts CSV to Parquet, and I am trying to run it via an Oozie workflow in Hue. It is the simplest possible workflow, consisting of a single step (a Spark program).
When I run it with a simple program jar (say, a hello-world-type Spark example), it works fine when submitted via Hue.
But with a bigger jar (~96 MB), the job gets stuck in the running state. The code itself is not the problem: the same jar works perfectly with spark-submit in the very same environment, under the same conditions. I am running both in client mode to keep debugging simple.
Also, you can normally tell from the logs whether there are exceptions or whether the job is hung due to memory issues (the continuous 'heart beat' info logs), but in this case I cannot even view the logs. When I manually kill the job, the logs are still not accessible. It says:
Could not find job job_1481270830724_0007.
{"RemoteException":{"exception":"NotFoundException","message":"java.lang.Exception: job, job_1481270830724_0007, is not found","javaClassName":"org.apache.hadoop.yarn.webapp.NotFoundException"}} (error 404)
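For anyone trying to reproduce this: the job ID in that error shares its cluster timestamp and sequence number with the underlying YARN application ID, so the aggregated logs can in principle be requested from the command line instead of through the web UI. Below is a minimal sketch, assuming YARN log aggregation is enabled and the `yarn` CLI is available on the node; the helper name `yarn_logs_command` is hypothetical and only builds the command, it does not run it.

```python
import re


def yarn_logs_command(job_id):
    """Build a `yarn logs` command for the application behind a MapReduce job ID.

    Hypothetical helper for illustration: it only rewrites the ID prefix,
    relying on job and application IDs sharing the same numeric suffix.
    """
    match = re.fullmatch(r"job_(\d+)_(\d+)", job_id)
    if match is None:
        raise ValueError("not a MapReduce job ID: %r" % job_id)
    cluster_timestamp, sequence = match.groups()
    application_id = "application_%s_%s" % (cluster_timestamp, sequence)
    # `yarn logs -applicationId <id>` fetches aggregated container logs.
    return ["yarn", "logs", "-applicationId", application_id]


# Job ID taken from the error message above.
print(" ".join(yarn_logs_command("job_1481270830724_0007")))
```

Whether this returns anything depends on the cluster configuration; if log aggregation is off, the logs stay on the individual NodeManager hosts.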
Can you help me understand what the issue could be? Does it have something to do with the memory allocated to Hue?
Thanks in advance