Oozie Launcher fails after repeating "Heart beat"

Contributor

Hi, I have a 10-node test cluster. The master node has 3 cores and 25 GB of memory; the other nodes each have 2 cores and 13 GB of memory, so resources are tight.

When I submit the Spark program from a terminal, it runs fine. But when I run the same Spark program from a workflow in Hue, it fails.

I have tried increasing the container maximum memory to 8 GB (following https://community.cloudera.com/t5/Batch-Processing-and-Workflow/Oozie-sqoop-action-in-CDH-5-2-Heart-...).

That did not help.

Here are the logs. Does anyone have any ideas?

 

```

>>> Invoking Spark class now >>>

Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1485152326646_0002 finished with failed status
org.apache.spark.SparkException: Application application_1485152326646_0002 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:256)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:207)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:231)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Oozie Launcher failed, finishing Hadoop job gracefully

 

```

1 ACCEPTED SOLUTION

Contributor

The problem was caused by a shortage of vcores.

In the YARN resource pool UI I found one container stuck in the pending state, so I increased the number of vcores per node.

It works now.
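For anyone who wants to check for the same symptom without the UI, here is a minimal sketch. It assumes you have already fetched the JSON body of the ResourceManager `/ws/v1/cluster/metrics` endpoint; the function name `diagnose_pending`, the zero-free-vcores heuristic, and the sample values are my own illustration, not output from this cluster.

```python
import json


def diagnose_pending(metrics_json: str) -> str:
    """Interpret the body of a ResourceManager /ws/v1/cluster/metrics response."""
    m = json.loads(metrics_json)["clusterMetrics"]
    pending = m["containersPending"]
    free_vcores = m["availableVirtualCores"]
    total_vcores = m["totalVirtualCores"]
    free_mb = m["availableMB"]

    if pending == 0:
        return "no pending containers"
    # Rough heuristic (an assumption): nothing free of a resource means
    # that resource is probably what the pending container is waiting on.
    if free_vcores == 0:
        return (f"{pending} pending, 0/{total_vcores} vcores free: "
                "vcores look like the bottleneck")
    if free_mb == 0:
        return f"{pending} pending, 0 MB free: memory looks like the bottleneck"
    return f"{pending} pending with resources free: check queue or pool limits"


# Hypothetical response body, for illustration only.
sample = json.dumps({
    "clusterMetrics": {
        "containersPending": 1,
        "availableVirtualCores": 0,
        "totalVirtualCores": 4,
        "availableMB": 6144,
    }
})
print(diagnose_pending(sample))
```

If the result points at vcores, the per-node value to look at is `yarn.nodemanager.resource.cpu-vcores` (exposed in Cloudera Manager as the NodeManager container virtual CPU cores setting).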


3 REPLIES

Explorer

Same problem here!

I can also run the job successfully with spark-submit.

I noticed additional information in the YARN log when I try to run it via Oozie and Hue:

2017-02-02 18:06:45,276 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: hdfs is accessing unchecked http://vms4580.saipm.com:43355/ws/v1/mapreduce/jobs/job_1486053773758_0003?user.name=hue&doAs=hdfs which is the app master GUI of application_1486053773758_0003 owned by hdfs

I also cannot track the application when I run it via Oozie, so it seems that it cannot connect correctly to the web tracking service.

I have a multihomed cluster; maybe the problem is related?

My cluster is also small, only 2 nodes. I also read that on a small cluster each queue assigns a fixed amount of memory (2048 MB) to complete a single MapReduce job, so if more than one MapReduce job runs in a single queue, they can deadlock. However, it is still not working after increasing the memory size and the Java heap size.
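To make the deadlock idea concrete: an Oozie Spark action first starts a launcher MapReduce job (one ApplicationMaster container plus one launcher map task), and that launcher keeps its containers while it waits for the Spark application it submitted. A rough sketch of the container count follows, assuming one vcore per container; the cluster size and executor count are hypothetical, and the helper names are mine.

```python
def containers_needed(executors: int, via_oozie: bool) -> int:
    """Containers held at the same time if every request is granted."""
    spark_containers = 1 + executors          # Spark ApplicationMaster + executors
    launcher_containers = 2 if via_oozie else 0  # launcher MR AM + launcher map task
    return spark_containers + launcher_containers


def fits(total_vcores: int, executors: int, via_oozie: bool,
         vcores_per_container: int = 1) -> bool:
    """True if all containers fit into the vcores YARN can hand out."""
    demand = containers_needed(executors, via_oozie) * vcores_per_container
    return demand <= total_vcores


# Hypothetical numbers: YARN can allocate 4 vcores in total, job asks for 2 executors.
print(fits(total_vcores=4, executors=2, via_oozie=False))  # spark-submit: 3 containers
print(fits(total_vcores=4, executors=2, via_oozie=True))   # Oozie: 5 containers
```

The same job that fits when run with spark-submit can therefore leave a container pending when it goes through Oozie, because the launcher's two containers are counted against the same pool.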

Contributor

The problem was caused by a shortage of vcores.

In the YARN resource pool UI I found one container stuck in the pending state, so I increased the number of vcores per node.

It works now.

Explorer
Thanks! This also worked for me.