
Hue - Spark job run - requirement failed error

Hi,

I have created a simple Oozie workflow to run a Spark/Scala application that simply converts CSV files to Parquet format. It works fine in both client and cluster mode when run from the shell (spark-submit), but when it is launched through the Oozie workflow it throws the following error:
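
For context, the application itself is trivial; a minimal sketch of the conversion is below (the object name and HDFS paths are placeholders, and I'm assuming the spark-csv package for Spark 1.x-style CSV reading, not my exact code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Placeholder object name; reads CSV from HDFS and writes Parquet back out.
object CsvToParquet {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CsvToParquet"))
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read
      .format("com.databricks.spark.csv") // assumes the spark-csv package is on the classpath
      .option("header", "true")
      .load("hdfs:///user/me/input/*.csv") // placeholder input path
    df.write.parquet("hdfs:///user/me/output/parquet") // placeholder output path
    sc.stop()
  }
}

From the shell it runs cleanly with the same memory settings, along the lines of (jar name is a placeholder):

spark-submit \
  --master yarn-cluster \
  --class CsvToParquet \
  --driver-memory 600m \
  --executor-memory 500m \
  csv-to-parquet.jar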

APP[Test_spark_run] JOB[0000011-161124152230543-oozie-oozi-W] ACTION[0000011-161124152230543-oozie-oozi-W@spark-bd99] Launcher exception: requirement failed
java.lang.IllegalArgumentException: requirement failed
	at scala.Predef$.require(Predef.scala:221)
	at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:473)
	at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:471)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:471)
	at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:469)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:469)
	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:724)
	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1022)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:256)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:207)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:231)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

It happens when I set the options list to "--driver-memory 600m --executor-memory 500m". If I provide only "--driver-memory 600m", the error doesn't appear, but the job then gets killed because of the memory constraint. Am I providing the options list values incorrectly (e.g., missing a delimiter), or is this a Scala/Oozie version compatibility issue, or something else?
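
For reference, my understanding is that the Hue options list ends up in the <spark-opts> element of the generated workflow.xml, so the action should look roughly like this (a sketch with placeholder class and jar names, not the actual generated file):

<action name="spark-bd99">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>Test_spark_run</name>
        <class>CsvToParquet</class> <!-- placeholder main class -->
        <jar>${nameNode}/user/me/lib/csv-to-parquet.jar</jar> <!-- placeholder jar path -->
        <spark-opts>--driver-memory 600m --executor-memory 500m</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="kill"/>
</action>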

Thanks