Support Questions
Find answers, ask questions, and share your expertise
Announcements

In HDP3, Not able to create Hive DB/Table using PySpark.


New Contributor

In HDP 3, I am trying to ingest a DB and do a file copy using PySpark. After the job completes, the Hive DB is not listed in hive-cli. After checking the properties in Ambari, I found that there is a separate Spark warehouse directory where all the tables are created, and it is not listed under the Hive databases:

spark.sql.warehouse.dir=/apps/spark/warehouse

So I changed the above property to the Hive warehouse directory:

spark.sql.warehouse.dir=/warehouse/tablespace/managed/hive

After changing the Spark warehouse property I reran the Spark job and tried show databases from hive-cli again; no luck. When I checked the HDFS location /warehouse/tablespace/managed/hive, I can see all the DB names created as .db directories.

How do I resolve this issue? Please help.
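For context, my own reading of the HDP 3 defaults (an assumption worth verifying in Ambari) is that Spark and Hive keep separate metastore catalogs, so databases created via plain spark.sql() are registered in the Spark catalog and never appear in hive-cli, no matter which warehouse directory is configured:

```
# HDP 3 default (verify in your Ambari configs): Spark uses its own catalog
metastore.catalog.default=spark

# Changing spark.sql.warehouse.dir only moves the data files;
# the metadata still lands in the Spark catalog, which hive-cli
# does not read.
spark.sql.warehouse.dir=/apps/spark/warehouse
```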

5 Replies

Re: In HDP3, Not able to create Hive DB/Table using PySpark.

Super Collaborator

Hi Giridharan,

You need to use the Hive Warehouse Connector to access Hive databases from HDP 3 onwards.

please see https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouse...
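In practice that means launching PySpark with the HWC assembly jar and its configuration, roughly like this (a sketch only; the jar/zip versions, hostnames, ports, and LLAP app name below are placeholders you must adapt from the docs above and your cluster):

```
# Sketch: starting pyspark with the Hive Warehouse Connector on HDP 3.
# <version>, <hiveserver2-host>, <metastore-host>, @llap0 are placeholders.
pyspark --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
  --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-<version>.zip \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://<hiveserver2-host>:10000/" \
  --conf spark.datasource.hive.warehouse.metastoreUri="thrift://<metastore-host>:9083" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp" \
  --conf spark.hadoop.hive.llap.daemon.service.hosts="@llap0"
```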

Re: In HDP3, Not able to create Hive DB/Table using PySpark.

Super Collaborator

Re: In HDP3, Not able to create Hive DB/Table using PySpark.

New Contributor

Hello @subhash parise, I am still facing an issue connecting to the Hive warehouse. Please find the error below:


>>> from pyspark_llap import HiveWarehouseSession
>>> hive = HiveWarehouseSession.session(spark).build()
>>> hive.showDatabases().show(100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/spark-ac3ac8d8-5b24-4339-9ef6-e6eed46932cf/userFiles-29c83c49-13b6-4e28-b902-5a6cfbdf7ada/pyspark_hwc-1.0.0.3.0.0.0-1634.zip/pyspark_llap/sql/session.py", line 127, in showDatabases
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/local/lib/python3.7/site-packages/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o36.showDatabases.
: java.lang.RuntimeException: shadehive.org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [anonymous] does not have [USE] privilege on [default]
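The user [anonymous] in that error suggests the HWC session is not passing an identity to HiveServer2. A common workaround (a sketch; the property name is from the HDP 3 HWC configuration, and the user value is a placeholder) is to carry a user in the JDBC URL, or to grant the privilege in Ranger:

```
# Option 1 (placeholder credentials): include a user in the HWC JDBC URL
spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://<hiveserver2-host>:10000/;user=<your-user>

# Option 2: in Ranger, grant the [USE] privilege on the `default`
# database to the user the Spark job runs as.
```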




Re: In HDP3, Not able to create Hive DB/Table using PySpark.

New Contributor

@subhash parise - Thanks for the answer..

As mentioned in the URL https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouse... I added the custom spark2-defaults properties and reran the jobs. The job succeeded, but I still do not see any databases when running show databases from hive-cli.

Re: In HDP3, Not able to create Hive DB/Table using PySpark.

Super Collaborator