Community Articles
Find and share helpful community-sourced technical articles.
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.
Labels (2)
Super Guru

Please follow below steps to run spark2 action via Oozie on HDP clusters.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch_oozie-s...

Your Oozie job may get failed with below error because of jar conflicts between 'oozie' sharelib and 'spark2' sharelib.

Error:

2018-06-04 13:27:32,652 WARN SparkActionExecutor:523 - SERVER[XXXX] USER[XXXX] GROUP[-] TOKEN[] APP[XXXX] JOB[0000000-<XXXXX>-oozie-oozi-W] ACTION[0000000-<XXXXXX>-oozie-oozi-W@spark2] Launcher exception: Attempt to add (hdfs://XXXX/user/oozie/share/lib/lib_XXXXX/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache. 
java.lang.IllegalArgumentException: Attempt to add (hdfs://XXXXX/user/oozie/share/lib/lib_20170727191559/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache. 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:632) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:623) 
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:623) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:622) 
at scala.collection.immutable.List.foreach(List.scala:381) 
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:622) 
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:895) 
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171) 
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1231) 
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1290) 
at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:750) 
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:311) 
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:232) 
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58) 
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:62) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:237) 
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164) 

.

Please run below commands to fix this error:

Note - You may need to take backup before running rm commands.

hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/aws* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/azure* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-aws* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-azure* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/ok*
hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson* /user/oozie/share/lib/lib_<ts>/oozie.old 

.

Please run below command to update Oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

2,174 Views
0 Kudos
Don't have an account?
Version history
Last update:
‎06-08-2018 12:09 AM
Updated by:
Contributors
Top Kudoed Authors