
Database not found when querying Hive with Spark on localhost

New Contributor

Hello,

I'm trying to query a Hive table from Spark/Scala code in IntelliJ. I have installed Hadoop and Hive on a virtual machine (localhost environment), and the database 'test' and the table 'employee' have already been created from the Hive console. The versions of my components are the following:
- Spark 2.2.0
- Hive 1.1.0
- Hadoop 2.7.3

Here is the code I'm trying to run:

import org.apache.spark.sql.SparkSession

// warehouseLocation points to the default location for managed databases and tables
val warehouseLocation = "hdfs://localhost:9000/user/hive/warehouse"

val spark = SparkSession
 .builder()
 .appName("Spark Hive Local Connector")
 .config("spark.sql.warehouse.dir", warehouseLocation)
 .config("spark.master", "local")
 .enableHiveSupport()
 .getOrCreate()

spark.catalog.listDatabases().show(false)
spark.catalog.listTables().show(false)
spark.conf.getAll.mkString("\n")

import spark.implicits._
import spark.sql

sql("USE test")
sql("SELECT * FROM test.employee").show()

Hive access seems to be properly configured via the hive-site.xml file located in the /usr/local/apache-hive-1.1.0-bin/conf directory (Hive was installed with Derby).

When I run the code above, it shows the existing default database, but it doesn't list the tables inside the default and test databases:

+-------+---------------------+-----------------------------------------+
|name   |description          |locationUri                              |
+-------+---------------------+-----------------------------------------+
|default|Default Hive database|hdfs://localhost:9000/user/hive/warehouse|
+-------+---------------------+-----------------------------------------+

18/11/20 22:32:51 INFO SparkSqlParser: Parsing command: USE test
+----+--------+-----------+---------+-----------+
|name|database|description|tableType|isTemporary|
+----+--------+-----------+---------+-----------+
+----+--------+-----------+---------+-----------+

Here is the hive-site.xml configuration I'm using:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/usr/local/apache-hive-1.1.0-bin/metastore_db;create=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://localhost:9000/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.PersistenceManagerFactoryClass</name>
    <value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>
    <description>class implementing the jdo persistence</description>
  </property>
</configuration>
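Note that since hive.metastore.uris is set, Spark will try to reach the metastore over Thrift on port 9083 rather than opening the Derby files directly, so the standalone metastore service has to be running before the Spark job starts. A sketch of how to launch and check it (assuming the install path above):

```shell
# Start the standalone Hive metastore service in the background.
# With hive.metastore.uris set, Spark talks to this service on port 9083
# instead of opening the Derby metastore_db directly.
export HIVE_HOME=/usr/local/apache-hive-1.1.0-bin
"$HIVE_HOME"/bin/hive --service metastore &

# Verify the Thrift port is listening before starting the Spark job.
netstat -lnt | grep 9083
```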

The code then throws the following error:

Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'test' not found;

I have made many attempts and I still do not understand why I cannot access the 'test' database. I tried creating the 'employee' table in the 'default' database, but I got the same error. I also placed the hive-site.xml file in the /resource folder of the IntelliJ project, with the same result.

What could be missing in my Hive access setup? Thank you for your contributions!

1 REPLY 1

New Contributor

Check the Hive warehouse property in Spark: spark.sql.warehouse.dir = /apps/hive/warehouse
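It is also worth confirming which metastore the session actually connected to: if hive-site.xml is not on the driver's classpath, Spark silently creates a fresh local Derby metastore containing only the default database, which would explain the missing 'test' database. A minimal sketch, reusing the `spark` session from the question:

```scala
// Sketch: confirm which warehouse and metastore the session is really using.
// If hive.metastore.uris comes back null/empty, hive-site.xml was not picked
// up and Spark created its own empty local metastore ('default' only).
println(spark.conf.get("spark.sql.warehouse.dir"))
println(spark.sparkContext.hadoopConfiguration.get("hive.metastore.uris"))
spark.catalog.listDatabases().show(false)
```

If the printed warehouse directory or metastore URI differs from what hive-site.xml declares, fixing the classpath (e.g. copying hive-site.xml into Spark's conf directory) is usually the next step.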