
How to Enable Zookeeper Discovery for HiveServer2 HA

Hi,

I am trying to set up zookeeper discovery for HS2 and I am getting various connection errors when I try to connect via beeline.

The following is the connection string.

!connect jdbc:hive2://host1.com:2181,host2.com:2181,host3.com:2181;serviceDiscoveryMode=zooKeeper; zooKeeperNamespace=hiveserver2 

I am following this documentation. Can anyone confirm if there are any other steps/configs required? Is the connection string wrong perhaps? Does HS2 need to run in http mode or is binary ok?

Configuration Requirements

1. Set hive.zookeeper.quorum to the ZooKeeper ensemble (a comma-separated list of the ZooKeeper server host:port pairs running in the cluster).

2. Customize hive.zookeeper.session.timeout so that the connection between a HiveServer2 client and ZooKeeper is closed if a heartbeat is not received within the timeout period.

3. Set hive.server2.support.dynamic.service.discovery to true.

4. Set hive.server2.zookeeper.namespace to the value that you want to use as the root namespace on ZooKeeper. The default value is hiveserver2.

5. The administrator should ensure that the ZooKeeper service is running on the cluster, and that each HiveServer2 instance gets a unique host:port combination to bind to upon startup.
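Steps 1-4 above correspond to hive-site.xml properties. A minimal sketch (the hostnames and the timeout value are placeholders; adjust them for your cluster):

```xml
<!-- ZooKeeper ensemble used for HiveServer2 service discovery (step 1) -->
<property>
  <name>hive.zookeeper.quorum</name>
  <value>host1.com:2181,host2.com:2181,host3.com:2181</value>
</property>
<!-- Close the client-ZooKeeper connection if no heartbeat arrives in time (step 2) -->
<property>
  <name>hive.zookeeper.session.timeout</name>
  <value>600000</value>
</property>
<!-- Let each HiveServer2 instance register itself in ZooKeeper (step 3) -->
<property>
  <name>hive.server2.support.dynamic.service.discovery</name>
  <value>true</value>
</property>
<!-- Root znode under which HiveServer2 instances register (step 4) -->
<property>
  <name>hive.server2.zookeeper.namespace</name>
  <value>hiveserver2</value>
</property>
```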

1 ACCEPTED SOLUTION

Master Collaborator

Can you try the following connection url (observe the / after the <ZOOKEEPER QUORUM>)?

jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver

The above is for binary mode; for http mode:

jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver;transportMode=http;httpPath=cliservice

For secure environments you will additionally have to add the hive principal, e.g.:

jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver;principal=hive/_HOST@EXAMPLE.COM;transportMode=http;httpPath=cliservice


13 REPLIES


I'll try it, thanks!

Unfortunately, that doesn't seem to work. Does it make a difference to the URL if HiveServer2 is in http transport mode? Don't you have to add that somewhere, as you do in a normal beeline connection string?

Master Collaborator

I updated the answer to include the URL for http mode as well as secure http mode. Beyond this there are other modes, such as SSL http, LDAP, and LDAP http; for each one the URL is configured a little differently.
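For example, an SSL-enabled binary-mode URL would add the standard Hive JDBC SSL parameters; a sketch, where the truststore path and password are placeholders for your environment:

```shell
# Sketch: ZooKeeper discovery plus SSL; truststore path/password are examples only
beeline -u "jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit"
```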

Works perfectly now, thanks!

A minor correction: In HDP-2.3.4, the zooKeeperNamespace is called "hiveserver2", not just "hiveserver". With that fix it works great!

Cloudera Employee

With HDP 2.3.4 and later releases, you don't need to specify any additional options as it includes the changes in HIVE-11581. Just the first URL is sufficient - jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.

See the hiveserver2 client doc

Also make sure you have the URL within quotes if you are specifying it on the command line, i.e. beeline -u 'hive2...'

This should work (Kerberos):

beeline -u "jdbc:hive2://zk1:2181,zk3:2181,zk2:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@REALM"

Non-Kerberos:

beeline -u "jdbc:hive2://zk1:2181,zk3:2181,zk2:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

Contributor

Can we specify a queue to be used instead of the default queue, e.g. by appending "tez.queue.name=xyz" to the connection string? Will that work?
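A sketch of what that might look like, assuming the Hive JDBC URL convention of passing hive conf settings after a ? separator (the queue name xyz is a placeholder, and the property has to be permitted by hive.security.authorization.sqlstd.confwhitelist):

```shell
# Hive conf settings go after '?' in the JDBC URL; xyz is a placeholder queue name
beeline -u "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=xyz"
```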

Expert Contributor

@Arti Wadhwani Do you have the answer to your question?
I'm trying to do that (connecting with ZooKeeper discovery and specifying the Tez queue), but it doesn't work.

New Contributor

I'm trying to run a DAG with Airflow 1.10.12 and HDP 3.0.0.

When I run the DAG, it gets stuck at ```Connecting to jdbc:hive2://[Server2_FQDN]:2181,[Server1_FQDN]:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2```

When I run ```beeline -u "jdbc:hive2://[Server1_FQDN]:2181,[Server2_FQDN]:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"``` from a shell, it connects to Hive with no problem.
I've also made a connection like this

```
Conn Id *
hive_jdbc
-------------
Conn Type

-------------
Connection URL
jdbc:hive2://centosserver.son.ir:2181,centosclient.son.ir:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
-------------
Login
hive
-------------
Password
******
-------------
Driver Path
/usr/hdp/3.0.0.0-1634/hive/jdbc/hive-jdbc-3.1.0.3.0.0.0-1634-standalone.jar
-------------
Driver Class
org.apache.hive.jdbc.HiveDriver

```

and I'm not using Kerberos.
I've also added ```hive.security.authorization.sqlstd.confwhitelist.append``` in the Ambari ```Custom hive-site```

```

radoop\.operation\.id|mapred\.job\.name|airflow\.ctx\.dag_id|airflow\.ctx\.task_id|airflow\.ctx\.execution_date|airflow\.ctx\.dag_run_id|airflow\.ctx\.dag_owner|airflow\.ctx\.dag_email|hive\.warehouse\.subdir\.inherit\.perms|hive\.exec\.max\.dynamic\.partitions|hive\.exec\.max\.dynamic\.partitions\.pernode|spark\.app\.name

```

Any suggestions? I'm desperate; I've tried every way I know, but still nothing.

@nsabharwal @agillan @msumbul1 @deepesh1 

Guru

There is no need to pass the principal name when zookeeper quorum is being used for JDBC. As long as a valid ticket is available and impersonation settings are appropriate, it will work:

[root@services RHive]# kinit -kt myuser.service.keytab myuser/services.hortonworks.com@HDP.COM
[root@services RHive]# beeline -u "jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/node1.hortonworks.com@HDP.COM"
Connecting to jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/node1.hortonworks.com@HDP.COM
Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
Driver: Hive JDBC (version 1.2.1000.2.5.0.0-1245)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.5.0.0-1245 by Apache Hive
0: jdbc:hive2://node1.hortonworks.com:2181/> !q
Closing: 0: jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/node1.hortonworks.com@HDP.COM
[root@services RHive]# beeline -u "jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
Connecting to jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
Driver: Hive JDBC (version 1.2.1000.2.5.0.0-1245)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.5.0.0-1245 by Apache Hive
0: jdbc:hive2://node1.hortonworks.com:2181/> !q
Closing: 0: jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
[root@services RHive]# kdestroy
[root@services RHive]# beeline -u "jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
Connecting to jdbc:hive2://node1.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
16/11/09 21:57:15 [main]: ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

New Contributor

Your connection string is:

!connect jdbc:hive2://host1.com:2181,host2.com:2181,host3.com:2181;serviceDiscoveryMode=zooKeeper; zooKeeperNamespace=hiveserver2

The / is missing in your connection string after the ZooKeeper ensemble.

The correct connection string looks like this:

!connect jdbc:hive2://host1.com:2181,host2.com:2181,host3.com:2181/;serviceDiscoveryMode=zooKeeper; zooKeeperNamespace=hiveserver2

How to avoid issues with the JDBC connection string?

(This works only in recent versions; tested on HDP 2.5 and HDP 2.6.)

1) In Ambari, go to Hive ---> Summary ---> click the left arrow button; the connection string is copied.

2) Paste the connection string in beeline.