Created 07-20-2017 08:48 AM
I'm using HDP 2.6, which comes with Oozie 4.2 and Spark2 installed. I followed the Hortonworks guide at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-s... to add the Spark2 libraries to Oozie 4.2, and then submitted the job with this property added:
oozie.action.sharelib.for.spark=spark2
The error I'm getting is this:
2017-07-19 12:36:53,271 WARN SparkActionExecutor:523 - SERVER[] USER[admin] GROUP[-] TOKEN[] APP[Workflow2] JOB[0000012-170717153234639-oozie-oozi-W] ACTION[0000012-170717153234639-oozie-oozi-W@spark_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
2017-07-19 12:36:53,275 WARN SparkActionExecutor:523 - SERVER[] USER[admin] GROUP[-] TOKEN[] APP[Workflow2] JOB[0000012-170717153234639-oozie-oozi-W] ACTION[0000012-170717153234639-oozie-oozi-W@spark_1] Launcher exception: Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:629)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:620)
    at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:620)
    at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:619)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:619)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:892)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1228)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1287)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:745)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:311)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:232)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:62)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:239)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
I have read that Oozie will not work with Spark 2.1 because of a change in how Spark handles a file that is found multiple times in the distributed cache, as mentioned in a linked post. Keep in mind that I'm using Ambari and HDP 2.6. How can I deal with this? @Tom Shields
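For anyone triaging a similar launcher log, here is a minimal sketch that pulls the offending jar path out of the "multiple times to the distributed cache" message shown above. The function names `duplicated_jars` and `sharelib_subdir` are made up for illustration, and the regular expression assumes the message wording is exactly as it appears in this log.

```python
import re

# Message wording copied from the launcher log in this post; the text inside
# the parentheses is the jar that Spark was asked to add more than once.
DUPLICATE_RE = re.compile(
    r"Attempt to add \((?P<path>[^)]+)\) multiple times to the distributed cache"
)


def duplicated_jars(log_text):
    """Return the distinct jar paths named in the error, in first-seen order."""
    seen = []
    for match in DUPLICATE_RE.finditer(log_text):
        path = match.group("path")
        if path not in seen:
            seen.append(path)
    return seen


def sharelib_subdir(path):
    """Return the name of the directory that directly contains the jar."""
    parts = path.rstrip("/").split("/")
    return parts[-2] if len(parts) >= 2 else ""
```

Feeding the whole launcher log to `duplicated_jars` gives the list of jars to look for in the other share lib directories. `sharelib_subdir` only reports which directory the logged copy lives in.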
Created 09-14-2017 07:21 PM
Hi Tom,
I have the same issue. Did you finally manage to make Oozie work with Spark2?
Régis
Created 06-16-2018 09:42 AM
I'm having exactly the same issue you describe - have you managed to solve this yet?
Created 06-16-2018 12:44 PM
@James Porritt @easyoups @Ivan Majnaric The above error occurs because the same jar files exist in both the oozie and spark2 directories inside the Oozie share lib in HDFS.
Follow the linked HC article to resolve this problem.
HTH
Created 06-17-2018 08:55 AM
You are hitting this bug: https://issues.apache.org/jira/browse/OOZIE-2787. To get rid of the error, make sure that no jar file is present in more than one of the oozie.libpath, Oozie share lib and Spark share lib directories.
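The duplicate check described above can be sketched roughly as follows. This is only an illustration: `parse_ls` and `jars_in_several_dirs` are hypothetical helper names, and the parser assumes the usual eight-column output of a plain `hdfs dfs -ls <dir>` with no spaces in file names.

```python
import posixpath
from collections import defaultdict


def parse_ls(ls_output):
    """Extract file paths from the text printed by a plain `hdfs dfs -ls`."""
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        # File rows start with '-' (directories with 'd') and carry the path
        # in the last column; the "Found N items" header is skipped.
        if len(fields) >= 8 and fields[0].startswith("-"):
            paths.append(fields[-1])
    return paths


def jars_in_several_dirs(listings):
    """Map each jar file name to the directory labels that contain it,
    keeping only jars that appear under more than one label."""
    owners = defaultdict(set)
    for label, paths in listings.items():
        for path in paths:
            name = posixpath.basename(path)
            if name.endswith(".jar"):
                owners[name].add(label)
    return {
        name: sorted(labels)
        for name, labels in owners.items()
        if len(labels) > 1
    }
```

To use it, save the `hdfs dfs -ls` output for each directory you want to compare (for example the oozie and spark2 directories of the share lib, plus your oozie.libpath), pass each through `parse_ls`, and hand the results to `jars_in_several_dirs` keyed by a label of your choice. Which copy of a duplicated jar to remove is left to you; the thread does not settle that.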