Created on 01-22-2017 10:53 PM - edited 09-16-2022 03:55 AM
Hi, I have a 10-node cluster for testing. The master node has 3 cores and 25 GB of memory; the other nodes have 2 cores and 13 GB each, so resources are limited.
When I submit the Spark program from the terminal, it runs fine. But when I run the same Spark program from a HUE workflow, it fails.
I have tried increasing the maximum container memory to 8 GB, following https://community.cloudera.com/t5/Batch-Processing-and-Workflow/Oozie-sqoop-action-in-CDH-5-2-Heart-... , but it does not work.
Here are the logs. Does anyone have any ideas?
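For reference, the change I tried was along these lines (a sketch of the standard YARN container cap in yarn-site.xml, or the equivalent Cloudera Manager setting; my exact values may differ):

```xml
<!-- Sketch: raise the maximum container allocation YARN will grant.
     8192 MB matches the 8 GB mentioned above; adjust to your nodes. -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
```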
```
>>> Invoking Spark class now >>>
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1485152326646_0002 finished with failed status
org.apache.spark.SparkException: Application application_1485152326646_0002 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:256)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:207)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:231)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Oozie Launcher failed, finishing Hadoop job gracefully
```
Created 02-02-2017 09:05 PM
The problem was caused by a lack of vcores.
I found one container pending in the YARN resource pool UI, so I increased the vcore value per node.
Now it works.
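For anyone hitting the same thing: the per-node vcore count is controlled by `yarn.nodemanager.resource.cpu-vcores` (settable in yarn-site.xml or via Cloudera Manager). A sketch of the change; the value below is only an example for a 2-core node, not the exact number used here:

```xml
<!-- Sketch: advertise more vcores per NodeManager so the pending
     container (the Spark AM or launcher) can be scheduled.
     Oversubscribing physical cores slightly is common on test clusters. -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>
```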
Created 02-02-2017 09:31 AM
Same problem here!
I can also run the job successfully with spark-submit.
I noticed additional information in the YARN log when I try to do it via Oozie and Hue:
2017-02-02 18:06:45,276 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: hdfs is accessing unchecked http://vms4580.saipm.com:43355/ws/v1/mapreduce/jobs/job_1486053773758_0003?user.name=hue&doAs=hdfs which is the app master GUI of application_1486053773758_0003 owned by hdfs
I also cannot track the application when I run it via Oozie, so it seems it can't connect correctly to the web tracking service.
I have a multihomed cluster; maybe the problem is related?
My cluster is also small, only 2 nodes. I have also read that on a small cluster each queue assigns a fixed amount of memory (2048 MB) to complete a single MapReduce job, and if more than one MapReduce job runs in the same queue it can deadlock. However, it is still not working after increasing the memory size and the Java heap size.
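On the deadlock point: an Oozie Spark action first runs a launcher MapReduce job, which then submits the actual Spark application, so on a tiny cluster the two can compete for the same queue's resources and stall each other (which would match the endless "Heart beat" lines). One workaround sometimes suggested is to route launchers into their own queue via the standard `oozie.launcher.*` override; a sketch, assuming a queue named `launcherq` already exists in your scheduler config:

```xml
<!-- Sketch: send the Oozie launcher job to a separate queue so it does
     not hold resources the Spark job it launches is waiting for.
     "launcherq" is a hypothetical queue name; it must exist in your
     fair/capacity scheduler configuration. -->
<property>
  <name>oozie.launcher.mapred.job.queue.name</name>
  <value>launcherq</value>
</property>
```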