11-20-2017 08:15 AM
I could not access the Hive tables from the PySpark shell. We are using Cloudera CDH 5.13 and Spark 2.2.
I have a list of tables in the Hive databases, but I could not access them from the Spark shell.
Please find the attached snapshot for more details.
As a workaround, I checked hive-site.xml and spark-env.sh; everything seems to be correct.
Which Hive database is used by PySpark in this case, and why is it different from Hive's? How can I point Spark to the correct Hive database and access the a1, a2, and a3 tables in Hive?
Note: if I create a new table from the Spark shell, only then can I access it.
Looking forward to your expert advice.
11-21-2017 08:12 AM
This generally happens when the Hive service is not enabled for Spark2. Please ensure that you've selected the Hive service dependency on Spark2:
- log in to the CM Web UI
- go to the Spark2 service
- click on the Configuration tab
- in the search box, type in "hive"
- if the Hive service dependency is set to none, select the Hive service, then redeploy the client configuration and restart the services with stale configuration.
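After redeploying, you can confirm from a gateway host that the Spark2 client configuration actually picked up the Hive settings. A quick sanity check, assuming the default CDH parcel layout (adjust the paths for your environment):

```shell
# The Spark2 client config dir should now contain a hive-site.xml
ls -l /etc/spark2/conf/

# It should point at your Hive metastore; no match here means the
# Hive dependency still isn't wired into the Spark2 client config
grep hive.metastore.uris /etc/spark2/conf/hive-site.xml
```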
The default database is 'default'. Are your tables (a1, a2, a3) part of the 'default' database in Hive, or were they created in another database? If you run "show databases" from Spark SQL, does it list all the databases or just 'default'?