Created on 03-13-2017 06:52 AM - edited 09-16-2022 04:14 AM
Hi All,
We have many Oozie workflows in Hue that contain Spark actions which interact with Hive. We added hive-site.xml to the workflows, and everything worked fine with Cloudera 5.7.1. We have just upgraded to Cloudera 5.10 with the newest parcels, and the Oozie Spark actions can no longer reach the Hive warehouse. We tried adding hive-site.xml to the workflows, setting --files hdfs://<path to hive-site.xml> in the "Options list", and setting hive.metastore.uris in the properties, but nothing worked. If we start these Spark apps with spark-submit or from the Spark shell, they work fine. We also tried to reach the Hive warehouse from an Oozie Spark action on another, completely separate cluster (also CDH 5.10), and the problem occurs there too.
We are using a Postgres database for Hive metastore.
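For illustration, a Spark action set up along the lines described above would look roughly like the sketch below. This is not our exact workflow: the action name, class, jar path, metastore host, and the ${hiveSiteXml} parameter are placeholders, and the spark-action schema version is an assumption.

```xml
<!-- Sketch only: all names, paths, and hosts below are placeholders. -->
<action name="spark-hive-example">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Metastore URI set as an action property (hypothetical host) -->
            <property>
                <name>hive.metastore.uris</name>
                <value>thrift://metastore-host.example.com:9083</value>
            </property>
        </configuration>
        <master>yarn-cluster</master>
        <name>SparkHiveExample</name>
        <class>com.example.SparkHiveExample</class>
        <jar>${nameNode}/path/to/app.jar</jar>
        <!-- Corresponds to the "Options list" field in Hue;
             ${hiveSiteXml} stands for the HDFS path to hive-site.xml -->
        <spark-opts>--files ${hiveSiteXml}</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="kill"/>
</action>
```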
Has anybody been able to create a working Oozie Spark action that reaches Hive on a CDH version newer than 5.7?
This issue has come up many times in the last few months here in Cloudera’s forum, but there is no solution, so any help would be much appreciated! Thanks
[main] WARN org.apache.hadoop.hive.metastore.HiveMetaStore - Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:781)
....
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("org.apache.derby.jdbc.EmbeddedDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:237)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:110)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
    ... 101 more
Caused by: org.datanucleus.store.rdbms.datasource.DatastoreDriverNotFoundException: The specified datastore driver ("org.apache.derby.jdbc.EmbeddedDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
    at org.datanucleus.store.rdbms.datasource.AbstractDataSourceFactory.loadDriver(AbstractDataSourceFactory.java:58)
    at org.datanucleus.store.rdbms.datasource.BoneCPDataSourceFactory.makePooledDataSource(BoneCPDataSourceFactory.java:61)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:217)
    ... 103 more