
Hive on Spark fails with exception through Oozie



I am trying to run Hive with Spark as the execution engine instead of MR. The operation is fairly simple: it creates a table and then inserts into it. When I trigger the script from the Hive shell, the job runs to completion.

 

When I schedule the same script through Oozie (on the same machine), the job starts but fails almost immediately with the following error.

 

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], main() threw exception, scala/collection/Iterable

 

java.lang.NoClassDefFoundError: scala/collection/Iterable
at org.apache.hadoop.hive.ql.parse.spark.GenSparkProcContext.<init>(GenSparkProcContext.java:163)
at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.generateTaskTree(SparkCompiler.java:329)
at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:204)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10310)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:193)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:223)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:558)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1356)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1473)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1275)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:226)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:175)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:389)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:324)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:422)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:438)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:732)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:333)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:310)
.
.


Caused by: java.lang.ClassNotFoundException: scala.collection.Iterable
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 37 more

 

After setting 'oozie.action.sharelib.for.spark' to 'hive,spark' in the workflow XML, the above exception went away, but the job then failed with the following error.

 

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
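
For reference, a minimal sketch of how the property mentioned above would appear in the workflow XML. Only the property name and value come from the description; placing it inside the action's `<configuration>` block is an assumption, and the rest of the workflow is omitted.

```xml
<!-- Sketch only: assumed to sit inside the Hive action of the workflow XML -->
<configuration>
    <property>
        <!-- Property name and value exactly as described in this post -->
        <name>oozie.action.sharelib.for.spark</name>
        <value>hive,spark</value>
    </property>
</configuration>
```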


I did check whether the Oozie shared libraries are available in HDFS, and they are (listed below).

[Available ShareLib]
hive
distcp
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig

 

Please suggest if I am missing something here. Let me know if more details are needed.
