Created on 02-26-2016 07:14 AM - edited 09-16-2022 03:06 AM
We have CDH 5.5.2 installed using Cloudera Manager. Hive is not installed.
When using spark-submit or spark-shell, we see a lot of errors apparently related to Hive, although Spark itself works fine.
Grepping the exceptions out of the long stack traces, the errors are:
Caused by: java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
Caused by: ERROR XJ041: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
Caused by: ERROR XBM0H: Directory /var/lib/hive/metastore/metastore_db cannot be created.
The last exception is:
Caused by: ERROR XBM0H: Directory /var/lib/hive/metastore/metastore_db cannot be created.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.services.monitor.StorageFactoryService$10.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.services.monitor.StorageFactoryService.createServiceRoot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.createPersistentService(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.createPersistentService(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.createPersistentService(Unknown Source)
Any idea how to suppress these unnecessary errors?
Craig
Created 02-26-2016 12:19 PM
Hey Craig-
Spark's HiveContext requires the use of *some* metastore. In this case, since you're not specifying one, it's creating the default, file-based metastore_db.
Here are some more details:
http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
A few options:
1) make sure the location is writable by the user running your Spark processes
2) configure hive-site.xml to place the metastore files in a different location
3) move to MySQL (or an equivalent RDBMS) for true shared-metastore functionality (you might need it elsewhere anyway)
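For option 2, here's a minimal sketch of a hive-site.xml you could drop into Spark's conf/ directory. The `/home/craig/metastore_db` path is just a hypothetical example -- point it anywhere the user running spark-shell/spark-submit can write:

```xml
<?xml version="1.0"?>
<!-- Minimal hive-site.xml sketch: relocate the embedded Derby metastore.
     /home/craig/metastore_db is a hypothetical, user-writable path. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/home/craig/metastore_db;create=true</value>
  </property>
</configuration>
```

With that in place, HiveContext should create its Derby files under the new path instead of failing on /var/lib/hive/metastore.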