Support Questions

Find answers, ask questions, and share your expertise

Spark displays SQLException when Hive not installed


We have CDH 5.5.2 installed using Cloudera Manager. Hive is not installed.

 

When using spark-submit or spark-shell, we get a lot of errors apparently related to Hive, but Spark works fine.

 

Grepping the logs for the exceptions (omitting the long stack traces), the errors are:

Caused by: java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------

java.sql.SQLException: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.

Caused by: ERROR XJ041: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.

Caused by: ERROR XBM0H: Directory /var/lib/hive/metastore/metastore_db cannot be created.

 

The last exception is:

Caused by: ERROR XBM0H: Directory /var/lib/hive/metastore/metastore_db cannot be created.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.services.monitor.StorageFactoryService$10.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.services.monitor.StorageFactoryService.createServiceRoot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.createPersistentService(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.createPersistentService(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.createPersistentService(Unknown Source)

 

Any idea how to suppress these unnecessary errors?

 

Craig


Hey Craig-

 

Spark's HiveContext requires *some* metastore. In this case, since you're not specifying one, it tries to create the default file-based Derby metastore_db, and the directory it points at isn't writable by your user, hence the errors.

 

Here are some more details:

 

https://github.com/apache/spark/blob/99dfcedbfd4c83c7b6a343456f03e8c6e29968c5/examples/src/main/scal...

 

http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables

 

 

A few options:

1) make sure the location is writable by your Spark processes

2) configure hive-site.xml to point the metastore at a different location

3) move to MySQL or an equivalent database for true metastore functionality (you may need it elsewhere anyway)
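
For option 2, a minimal hive-site.xml sketch. The `/tmp/spark-metastore` path is just an example of a user-writable location, and the file would go in Spark's conf directory (e.g. /etc/spark/conf on a CM-managed node); adjust both for your cluster:

```xml
<configuration>
  <!-- Example only: point the embedded Derby metastore at a directory
       the user running spark-shell/spark-submit can actually write to. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/tmp/spark-metastore/metastore_db;create=true</value>
  </property>
</configuration>
```

Note the embedded Derby metastore only allows one connection at a time, which is another reason option 3 (MySQL or similar) is the right answer for anything multi-user.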