
Spark unable to connect to Hive database in HDP 3.0.1

Expert Contributor

Hi Folks,

Hope all are doing well!

I've upgraded HDP 2.6.5 to HDP 3.0.1.0-187 successfully. Now I'm trying to connect to Hive databases using spark-shell, but I'm unable to see any Hive databases. I even copied /etc/hive/conf/hive-site.xml to /etc/spark2/conf/ and restarted the Spark service, but after the restart, hive-site.xml reverted to the original file.

Is there any alternative solution to resolve the issue?

Kindly assist me in fixing it.

1 ACCEPTED SOLUTION

Super Collaborator

Hi Vinay,

Use the code below to connect to Hive and list the databases:

spark-shell --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://hiveserverip:10000/" --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp" --conf spark.hadoop.hive.zookeeper.quorum="zookeeperquorumip:2181" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar

val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

hive.showDatabases().show(100, false)

Reference article

https://github.com/hortonworks-spark/spark-llap/tree/master
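Once the session is built, the same `hive` handle can run further operations. A short sketch based on the spark-llap API (run inside the same spark-shell session; the database name below is an example):

```scala
// Assumes `hive` was built as above inside spark-shell.
hive.setDatabase("default")                     // switch to a database
hive.showTables().show()                        // list tables in the current database
val df = hive.executeQuery("select current_database()") // run a query through HiveServer2
df.show()                                       // results come back as a Spark DataFrame
```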


33 REPLIES

Expert Contributor

Hi Vinay,

From HDP 3.0 onwards, you should use the HiveWarehouseConnector library to work with Hive databases.

Please refer to the documentation below.

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/integrating-hive/content/hive_configure_a_s...
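For reference, the set of properties the connector expects in custom spark2-defaults looks roughly like this (hostnames and ports below are examples; substitute your cluster's values):

```properties
# Sketch of custom spark2-defaults for the Hive Warehouse Connector.
# All hostnames/ports are examples -- replace with your cluster's values.
spark.hadoop.hive.zookeeper.quorum=zk-host.example.com:2181
spark.hadoop.hive.llap.daemon.service.hosts=@llap0
spark.datasource.hive.warehouse.load.staging.dir=/tmp
spark.datasource.hive.warehouse.metastoreUri=thrift://metastore-host.example.com:9083
spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://hs2-host.example.com:10000
```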

Hope this helps!

Expert Contributor

Hi @Sampath Kumar

I have enabled Hive interactive query and added the following properties to the custom spark2-defaults configuration:

spark.hadoop.hive.zookeeper.quorum=sidchadoop04.test.com:2181

spark.hadoop.hive.llap.daemon.service.hosts=@llap0

spark.datasource.hive.warehouse.load.staging.dir=/tmp

spark.datasource.hive.warehouse.metastoreUri=thrift://sidchadoop04.test.com:9083

spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://sidchadoop04.test.com:10500

But the Hive databases are still not accessible from spark-shell.

Super Collaborator

Hi @Vinay,

Please connect to HiveServer2 instead of HiveServer2 Interactive by using the syntax below:

spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://sidchadoop04.test.com:10000

Expert Contributor

Hi @subhash parise

I have tried the same, but the Hive databases are still not visible.

Super Collaborator

Hi @Vinay

Please use the syntax below to connect to Hive from Spark:

spark-shell --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://************************:10000/" --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp" --conf spark.hadoop.hive.zookeeper.quorum="ip.************:2181" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar

Refer to the document below for database operations using Spark with Hive:

https://github.com/hortonworks-spark/spark-llap/tree/master

Please accept the answer if it helps.

Thank you.

Expert Contributor

@subhash parise

I have also tried:

spark-shell --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://************************:10000/" --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp" --conf spark.hadoop.hive.zookeeper.quorum="ip.************:2181" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar

but I get the same result. Only the default database is visible.

Mentor

@Vinay

This article explains it all: Hive Warehouse Connector for accessing Apache Spark data.
The Hive Warehouse Connector supports the following applications:

  • Spark shell
  • PySpark
  • The spark-submit script

Explorer

Hello, how do you load your spark-shell in this version? I can't access it.

Community Manager

@sattar As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also provide the opportunity to provide details specific to your environment that could aid others in providing a more accurate answer to your question. 



Regards,

Vidya Sargur,
Community Manager



Mentor

@Vinay

Can you share your Spark code so I can test whether you set the parameters as above?

Is your cluster kerberized?

Expert Contributor

Hi @Geoffrey Shelton Okot

I have defined the following properties in the custom spark2-defaults configuration:

spark.hadoop.hive.zookeeper.quorum

spark.hadoop.hive.llap.daemon.service.hosts

spark.datasource.hive.warehouse.load.staging.dir

spark.datasource.hive.warehouse.metastoreUri

spark.sql.hive.hiveserver2.jdbc.url

Yes, we're using a Kerberized cluster.

Mentor

@Vinay

In a Kerberized cluster, you MUST add the parameter below:

spark.sql.hive.hiveserver2.jdbc.url.principal=$hiveS2@REALM

You can either copy the principal from Advanced hive-site (hive.server2.authentication.kerberos.principal) or get it with:

$ klist -kt /etc/security/keytabs/hiveserver2.service.keytab

Please check for the correct HS2 keytab in /etc/security/keytabs/; that should resolve the issue.

Expert Contributor

@Geoffrey Shelton Okot

I have already defined spark.sql.hive.hiveserver2.jdbc.url.principal=hive/_HOST@TEST.COM in the configuration.

Mentor

@Vinay

What do you mean by "After restart spark service, hive-site.xml to original xml file"? Make sure all changes are made through Ambari, otherwise they will be overwritten!

Can you give the latest status?

Expert Contributor

@Geoffrey Shelton Okot

I had manually copied /etc/hive/conf/hive-site.xml to /etc/spark2/conf/ and restarted the Spark service. After the restart, /etc/spark2/conf/hive-site.xml reverted to the previous hive-site.xml that I had replaced.

The latest status: Spark is still not able to see the Hive databases, even though I have also added the properties below to the Spark configuration:

spark.sql.hive.hiveserver2.jdbc.url.principal

spark.hadoop.hive.zookeeper.quorum

spark.hadoop.hive.llap.daemon.service.hosts

spark.datasource.hive.warehouse.load.staging.dir

spark.datasource.hive.warehouse.metastoreUri

spark.sql.hive.hiveserver2.jdbc.url

Mentor

@Vinay

Can you install the Hive and Spark clients on the Hive/Spark nodes?

Expert Contributor

@Geoffrey Shelton Okot

The Hive and Spark clients are already installed on the Hive and Spark nodes.

Expert Contributor

@Geoffrey Shelton Okot

Could you please confirm whether we really need to enable Interactive Query? After enabling it, I'm unable to start the Interactive Query service. Below are the logs:

2019-01-02T08:36:41,455 WARN [main] cli.LlapStatusServiceDriver: Watch mode enabled and got YARN error. Retrying..
2019-01-02T08:36:43,462 WARN [main] cli.LlapStatusServiceDriver: Watch mode enabled and got YARN error. Retrying..
2019-01-02T08:36:45,469 WARN [main] cli.LlapStatusServiceDriver: Watch mode enabled and got YARN error. Retrying..
2019-01-02T08:36:47,476 INFO [main] LlapStatusServiceDriverConsole: LLAP status unknown

Mentor

@Vinay

Yes, you need to enable Interactive Query.

Did you follow these steps: LLAP & Interactive query?

Remember also to enable YARN pre-emption via the YARN config.
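For the Capacity Scheduler, pre-emption comes down to enabling the scheduler monitor; a sketch (set it through Ambari's YARN configs rather than hand-editing files, or it will be overwritten):

```properties
# Enable the scheduler monitor that drives pre-emption (Capacity Scheduler).
yarn.resourcemanager.scheduler.monitor.enable=true
```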

HTH
