Hello, I'm trying to query a Hive table using Spark/Scala code in IntelliJ. I have installed Hadoop and Hive on a virtual machine (localhost environment), and the database 'test' and the table 'employee' have already been created from the Hive console. The versions of my components are the following:
- Spark 2.2.0
- Hive 1.1.0
- Hadoop 2.7.3

Here is the code I'm trying to run:

import org.apache.spark.sql.SparkSession
// warehouseLocation points to the default location for managed databases and tables
val warehouseLocation = "hdfs://localhost:9000/user/hive/warehouse"
val spark = SparkSession
.builder()
.appName("Spark Hive Local Connector")
.config("spark.sql.warehouse.dir", warehouseLocation)
.config("spark.master", "local")
.enableHiveSupport()
.getOrCreate()
spark.catalog.listDatabases().show(false)
spark.catalog.listTables().show(false)
println(spark.conf.getAll.mkString("\n"))
import spark.implicits._
import spark.sql
sql("USE test")
sql("SELECT * FROM test.employee").show() Hive access seems to be properly configured via the hive-site.xml file located into /usr/local/apache-hive-1.1.0-bin/conf directory (Hive have been installed with Derby). When i'm running the code below, it shows the existing default database but it doesn't show the table that are inside default and test databases : +-------+---------------------+-----------------------------------------+
|name |description |locationUri |
+-------+---------------------+-----------------------------------------+
|default|Default Hive database|hdfs://localhost:9000/user/hive/warehouse|
+-------+---------------------+-----------------------------------------+
18/11/20 22:32:51 INFO SparkSqlParser: Parsing command: USE test
+----+--------+-----------+---------+-----------+
|name|database|description|tableType|isTemporary|
+----+--------+-----------+---------+-----------+
+----+--------+-----------+---------+-----------+

Here is the configuration of the hive-site.xml file I'm using:

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/local/apache-hive-1.1.0-bin/metastore_db;create=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://localhost:9000/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.PersistenceManagerFactoryClass</name>
<value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>
<description>class implementing the jdo persistence</description>
</property>
</configuration>
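For reference, here is a minimal sketch of the same session setup with the metastore URI passed directly to the builder instead of being read from hive-site.xml on the classpath. The values are taken from the configuration above; it assumes the Thrift metastore service is actually running on localhost:9083 (e.g. started with: hive --service metastore).

import org.apache.spark.sql.SparkSession

// Point the session at the running Thrift metastore explicitly, so the
// result no longer depends on where hive-site.xml is found.
val spark = SparkSession
  .builder()
  .appName("Spark Hive Local Connector")
  .master("local")
  .config("hive.metastore.uris", "thrift://localhost:9083")
  .config("spark.sql.warehouse.dir", "hdfs://localhost:9000/user/hive/warehouse")
  .enableHiveSupport()
  .getOrCreate()

// If the metastore connection works, 'test' should appear here.
spark.sql("SHOW DATABASES").show(false)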
Once the code has finished, it throws the following error:

Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'test' not found;

I have tried many things and I still do not understand why I cannot access the 'test' database. I tried creating the 'employee' table in the 'default' database, but I got the same error. I also placed the hive-site.xml file in the resources folder of the IntelliJ project and the result is the same. What could be missing when accessing Hive? Thank you for your contributions!
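In case it is useful, this is the quick check I run right after getOrCreate() to see whether the session actually picked up the Hive catalog at all (a sketch; if the value printed is "in-memory" instead of "hive", the session never connected to the Hive metastore and only sees its own empty local catalog):

// "hive" means enableHiveSupport() took effect; "in-memory" means it did not.
println(spark.sparkContext.getConf.get("spark.sql.catalogImplementation", "in-memory"))
println(spark.conf.get("spark.sql.warehouse.dir"))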
Labels:
- Apache Hive
- Apache Spark