Created 09-14-2018 08:04 PM
I have installed hdp3.0 and using Spark 2.3 and Hive 3.1.
When I try to access Hive tables through Spark (pyspark/spark-shell), I get the error below.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/hdp/current/spark2-client/python/pyspark/sql/session.py", line 716, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/hdp/current/spark2-client/python/pyspark/sql/utils.py", line 71, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: u"Database 'test' not found;"
Only the default Hive database is visible in Spark.
>>> spark.sql("show databases").show()
+------------+
|databaseName|
+------------+
|     default|
+------------+
>>>
The content of hive-site.xml is not the same in the spark/conf and hive/conf directories.
-rw-r--r-- 1 hive  hadoop 23600 Sep 14 09:21 /usr/hdp/current/hive-client/conf/hive-site.xml
-rw-r--r-- 1 spark spark   1011 Sep 14 12:02 /etc/spark2/3.0.0.0-1634/0/hive-site.xml
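One way to see how the two files differ is to compare their property names. Here is a minimal sketch, assuming the standard Hadoop-style *-site.xml layout; the inline sample content is illustrative only, not taken from the actual cluster (in practice you would read each file with open(path).read()):

```python
import xml.etree.ElementTree as ET

def property_names(xml_text):
    """Return the set of <name> values in a Hadoop-style *-site.xml."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name") for p in root.iter("property")}

# Illustrative stand-ins for the two hive-site.xml files.
hive_conf = """<configuration>
  <property><name>hive.metastore.uris</name><value>thrift://host:9083</value></property>
  <property><name>hive.metastore.warehouse.dir</name><value>/warehouse/tablespace/managed/hive</value></property>
</configuration>"""
spark_conf = """<configuration>
  <property><name>hive.metastore.uris</name><value>thrift://host:9083</value></property>
</configuration>"""

# Properties present in hive/conf but absent from spark/conf.
missing = property_names(hive_conf) - property_names(spark_conf)
print(missing)
```

Given the large size difference between the two files (23600 vs 1011 bytes), a diff like this would likely show many properties missing on the Spark side.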
I even tried initiating the Spark session with hive/conf/hive-site.xml, but that did not help either.
pyspark --files /usr/hdp/current/hive-client/conf/hive-site.xml
Should I copy the hive-site.xml file from the hive conf dir to the spark conf dir (or anywhere else as well)?
Or will changing a property in the Ambari UI work?
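As a quick test of whether the metastore connection itself is the problem, the metastore URI can also be passed on the command line instead of relying on spark/conf. A sketch with a placeholder host (spark.hadoop.* is the standard Spark prefix for forwarding Hadoop/Hive properties; the host and port below are assumptions, not values from this cluster):

```shell
# Placeholder endpoint; substitute your actual metastore thrift host:port.
pyspark --conf spark.hadoop.hive.metastore.uris=thrift://<metastore-host>:9083
```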
Created 09-15-2018 08:53 PM
The default database it was showing was Spark's own default database, whose location is '/apps/spark/warehouse', not the default database of Hive.
I was able to resolve this by copying hive-site.xml from the hive conf dir to the spark conf dir.
cp /etc/hive/conf/hive-site.xml /etc/spark2/conf
Try running this query in your metastore database; in my case it is MySQL.
mysql> SELECT NAME, DB_LOCATION_URI FROM hive.DBS;
You will see two default databases there, one pointing to 'spark.sql.warehouse.dir' and the other to 'hive.metastore.warehouse.dir'. The locations will depend on the values you have set for these configuration properties.
Created 05-06-2019 09:22 PM
Hi,
I am trying to run a Spark application that needs access to Hive databases, but Hive databases such as FOODMART are not visible in the Spark session.
I ran spark.sql("show databases").show(); it does not show the FOODMART database, even though the Spark session has enableHiveSupport.
Below is what I've tried:
1)
cp /etc/hive/conf/hive-site.xml /etc/spark2/conf
2)
Changed spark.sql.warehouse.dir in spark UI from /apps/spark/warehouse to /warehouse/tablespace/managed/hive
Even so, it is not working.
Please let me know what configuration changes are required to make this work.
Please note: the above works in HDP 2.6.5.
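One HDP 3-specific detail may explain why the HDP 2.6.5 steps no longer suffice: on HDP 3, Spark and Hive keep separate metastore catalogs by default, so copying hive-site.xml alone may not expose Hive's managed tables. The HDP documentation describes a metastore.catalog.default setting for pointing Spark at the Hive catalog; a sketch (treat the exact property name and value as something to verify against your HDP release's documentation):

```shell
# Hedged sketch: on HDP 3.x, select the Hive catalog instead of Spark's
# own "spark" catalog. Verify this property against your HDP version's
# Spark-Hive integration docs before relying on it.
pyspark --conf spark.hadoop.metastore.catalog.default=hive
```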