I could not access Hive tables from the pyspark shell. We are using Cloudera CDH 5.13 with Spark 2.2.
I can see the list of tables in the Hive databases, but I cannot access them from the Spark shell.
Please find the attached snapshot for more details.
As a first troubleshooting step, I checked hive-site.xml and spark-env.sh; everything seems to be correct.
Which Hive database is pyspark using in this case, and why is it different from the one Hive uses? How can I point Spark at the correct Hive database and access the a1, a2, and a3 tables in Hive?
Note: if I create a new table from the Spark shell, only then can I access it.
Looking forward to your expert advice.
This generally happens when the Hive service is not enabled for Spark2. Please ensure that you have selected the Hive service dependency on Spark2:
- Log in to the CM web UI.
- Go to the Spark2 service.
- Click on the Configuration tab.
- In the search box, type "hive".
- If the Hive Service dependency is set to "none", select the Hive service, then redeploy the client configuration to clear the stale configuration.
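After redeploying, you can sanity-check that the hive-site.xml Spark2 picks up (typically under /etc/spark2/conf on CDH) actually names a metastore. A minimal stdlib sketch; the file path, hostname, and port below are placeholders, not values from your cluster:

```python
import xml.etree.ElementTree as ET

def metastore_uris(hive_site_xml):
    """Return the hive.metastore.uris value from hive-site.xml text, if set."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == "hive.metastore.uris":
            return prop.findtext("value")
    return None

# A minimal hive-site.xml fragment (hostname/port are hypothetical).
# In practice you would read the real file, e.g.:
#   open("/etc/spark2/conf/hive-site.xml").read()
sample = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>"""

print(metastore_uris(sample))  # thrift://metastore-host:9083
```

If this returns None for the deployed file, Spark2 has no metastore to talk to and will fall back to its own local warehouse, which matches the symptom of only seeing tables created from the Spark shell itself.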
The default database is 'default'. Are your tables (a1, a2, a3) part of the 'default' database in Hive, or were they created in another database? If you run "show databases" from Spark SQL, does it list all the databases or just 'default'?
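A quick way to make that comparison concrete: collect the database names each side reports (from `spark.sql("show databases")` in pyspark and `show databases;` in the Hive CLI) and diff them. The helper below is a plain-Python sketch; the database name `sales_db` is hypothetical, standing in for whichever database holds a1, a2, a3:

```python
def missing_databases(spark_sees, hive_has):
    """Databases Hive knows about but Spark does not see -- a sign the two
    sessions are pointed at different metastores."""
    return sorted(set(hive_has) - set(spark_sees))

# Hypothetical example: Spark only sees 'default', Hive has one more DB.
print(missing_databases(["default"], ["default", "sales_db"]))  # ['sales_db']
```

If the diff is non-empty (or Spark sees only 'default'), that confirms the Hive service dependency or the deployed client configuration is the problem rather than the tables themselves.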