I recently installed a brand new CDH 5.5.1 through Cloudera Manager on a pseudo-distributed server with 6 cores and 32 GB of RAM.
I have a Spark job that reads from some Hive tables. It runs fine when I launch it with spark-submit, but it fails when launched through Oozie.
The error is:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Any clue as to what may be missing?
Scrolling further down the stack trace, I found the underlying cause:
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("org.apache.derby.jdbc.EmbeddedDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
Once I packaged the org.apache.derby jar into my application jar, the application ran fine.
HiveContext needs org.apache.derby to create its ConnectionPool when it is instantiated.
<dependency>
    <groupId>org.apache.derby</groupId>
    <artifactId>derby</artifactId>
    <version>10.10.1.1</version>
</dependency>
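To confirm that the driver is actually visible to the JVM running the job, a quick classpath probe can help. This is only a minimal sketch: the class name DriverCheck and the method isOnClasspath are hypothetical names made up for illustration, and the only real identifier taken from the error above is org.apache.derby.jdbc.EmbeddedDriver. It assumes the check runs in the same JVM and with the same class loader that HiveContext would use, which may not hold in every Oozie setup.

```java
public class DriverCheck {

    // Hypothetical helper: returns true if the named class can be located
    // by the current thread's context class loader, without initializing it.
    public static boolean isOnClasspath(String className) {
        try {
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing, or present but not loadable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Driver class name copied from the NucleusException message.
        String driver = "org.apache.derby.jdbc.EmbeddedDriver";
        System.out.println(driver + " on classpath: " + isOnClasspath(driver));
    }
}
```

Calling isOnClasspath at the start of the Spark job's main method, before HiveContext is created, should show whether the Derby jar made it into the Oozie launcher's classpath.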