Member since 06-30-2020 | 3 Posts | 0 Kudos Received | 0 Solutions
10-04-2021 03:11 AM
I tried both `principal=hive/_HOST` and `principal=hive-myCluster/_HOST`. In both cases the `krb5kdc.log` does not mention the incorrect `zookeeper/server.com@REALM.COM` name, but beeline still prints the same error on the command line:

```
Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper (state=,code=0)
```
10-04-2021 02:21 AM
I tried adding the parameter like this:

```
/opt/hadoop/hive/bin/hive --config /etc/hive/conf.s2 --service beeline -u "jdbc:hive2://master-01.com:2181,master-02.com:2181,master-03.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=zookeeper-mycluster/_HOST@REALM.COM;sslTrustStore=/etc/ssl/certs/truststore.jks;trustStorePassword=****"
```

...and the default zookeeper principal name is unchanged (verified in the `krb5kdc.log` file on the KDC):

```
Oct 04 09:15:00 master-01.com krb5kdc[6869](info): TGS_REQ (4 etypes {18 17 16 23}) 192.168.32.15: LOOKING_UP_SERVER: authtime 0, hive-mycluster/master-03.com@REALM.COM for zookeeper/master-02.com@REALM.COM, Server not found in Kerberos database
```
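One possible reason the `principal=` URL parameter leaves the ZooKeeper lookup unchanged: in the Hive JDBC URL that parameter names the HiveServer2 service principal, whereas the ZooKeeper client library takes the primary part of the server principal from the JVM system property `zookeeper.sasl.client.username`, which defaults to `zookeeper`. The following is only a sketch, not something verified on this cluster. It assumes the launcher scripts forward `HADOOP_CLIENT_OPTS` to the client JVM, and that the same flag would also have to reach the HiveServer2 JVM, since the client in the log line is `hive-mycluster/...`. The helper name `zk_sasl_username_opt` is hypothetical.

```shell
# Hypothetical helper: build the JVM flag that tells the ZooKeeper client
# which primary (the part before the "/") to expect for the ZooKeeper servers.
zk_sasl_username_opt() {
  cluster="$1"
  printf -- '-Dzookeeper.sasl.client.username=zookeeper-%s' "$cluster"
}

# Assumption: HADOOP_CLIENT_OPTS is passed through to the JVM started by
# the hive/beeline launcher; adjust to whatever your deployment actually uses.
HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} $(zk_sasl_username_opt mycluster)"
export HADOOP_CLIENT_OPTS
echo "$HADOOP_CLIENT_OPTS"
```

If this applies, the next `TGS_REQ` in `krb5kdc.log` should ask for `zookeeper-mycluster/<host>@REALM.COM` rather than `zookeeper/<host>@REALM.COM`.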
10-01-2021 06:59 AM
I am working on a project to enable multiple hadoop clusters to be managed on the same machines. The premise is to customize the kerberos principal names to include the name of the target cluster, e.g. instead of `userX/server.com@REALM.COM` it would be `userX-clusterX/server.com@REALM.COM`. I have systematically adjusted the configuration of a test cluster so that all kerberos principal names (including service principals and smoke users) follow this naming convention. Here is a sample of the KDC's `list_principals` output:

```
[root@master-01 log]# kadmin -r REALM.COM -p ****/**** -w **** "list_principals"
HTTP/master-01.com@REALM.COM
admin/admin@REALM.COM
dn-mycluster/worker-03.com@REALM.COM
hive-mycluster/master-01.com@REALM.COM
jn-mycluster/master-01.com@REALM.COM
krbtgt/REALM.COM@REALM.COM
nn-mycluster/master-02.com@REALM.COM
rangerlookup-mycluster/master-02.com@REALM.COM
rm-mycluster/master-02.com@REALM.COM
spark-mycluster/worker-03.com@REALM.COM
zookeeper-mycluster/master-01.com@REALM.COM
zookeeper-mycluster/master-02.com@REALM.COM
```

The cluster deployment works as expected for hadoop, yarn, zookeeper and ranger, but hive + beeline are failing to authenticate (though the installation finishes without issue). Beeline is unable to connect to hive because it attempts to connect to `zookeeper/server.com@REALM.COM` rather than `zookeeper-mycluster/server.com@REALM.COM`. I used the following commands to connect to the hiveserver2 via zookeeper:

```
su smoke_user
kinit -kt ~/.ssh/smoke_user.principal.keytab smoke_user/master-02.com@REALM.COM
/opt/hadoop/hive/bin/hive --config /etc/hive/conf.s2 --service beeline -u "jdbc:hive2://master-01.com:2181,master-02.com:2181,master-03.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/etc/ssl/certs/truststore.jks;trustStorePassword=****"
```

...and this is the error it throws:

```
Connecting to jdbc:hive2://master-01.com:2181,master-02.com:2181,master-03.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/etc/ssl/certs/truststore.jks;trustStorePassword=****
Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper (state=,code=0)
```

The `/hiveserver2` znode has not been created, but those of other services have been (so I guess it's not a general problem with zookeeper but a specific one with hive + zookeeper). Suspecting a kerberos authentication problem, I saw in the `krb5kdc.log` that the incorrect zookeeper principal was being used by hive/beeline:

```
Oct 01 08:38:50 master-01.com krb5kdc[6796](info): TGS_REQ (4 etypes {18 17 16 23}) 192.168.32.15: LOOKING_UP_SERVER: authtime 0, hive-mycluster/master-03.com@REALM.COM for zookeeper/master-02.com@REALM.COM, Server not found in Kerberos database
Oct 01 08:38:50 master-01.com krb5kdc[6796](info): TGS_REQ (4 etypes {18 17 16 23}) 192.168.32.15: LOOKING_UP_SERVER: authtime 0, hive-mycluster/master-03.com@REALM.COM for zookeeper/master-02.com@REALM.COM, Server not found in Kerberos database
```

The principal `zookeeper/master-02.com@REALM.COM` is being generated somewhere in the process of attempting to authenticate with the hiveserver2, but this principal is incorrect and follows a naming convention that this project has deviated from. It should match what is in the KDC for zookeeper:

```
[root@master-01 log]# kadmin -r REALM.COM -p ****/**** -w **** "list_principals" | grep zookeeper
zookeeper-mycluster/master-01.com@REALM.COM
zookeeper-mycluster/master-02.com@REALM.COM
zookeeper-mycluster/master-03.com@REALM.COM
```

I have been poring over the hive, zookeeper and kerberos configuration documentation without finding any parameter that would allow me to set the zookeeper principal directly (it would be nice if I'm wrong, though).

How can I force a specific zookeeper principal name to be used?
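To make the mismatch described in this post easier to spot, here is a small sketch that pulls the client and the requested server principal out of failed `TGS_REQ` lines and checks each against the `<service>-<cluster>` naming convention. The function names `extract_tgs_failures` and `follows_convention` are hypothetical, and the pattern assumes log lines shaped exactly like the ones quoted in the post.

```shell
# Print "client requested_server" for each failed server lookup on stdin.
# Assumes the krb5kdc.log line format shown in the post.
extract_tgs_failures() {
  sed -n 's/.*LOOKING_UP_SERVER: authtime [0-9]*, \([^ ]*\) for \([^ ,]*\), Server not found in Kerberos database.*/\1 \2/p'
}

# Succeed if the primary (the part before the first "/") ends in "-<cluster>".
follows_convention() {
  principal="$1"
  cluster="$2"
  primary="${principal%%/*}"
  case "$primary" in
    *-"$cluster") return 0 ;;
    *) return 1 ;;
  esac
}

# Demo on the log line quoted in the post.
line='Oct 01 08:38:50 master-01.com krb5kdc[6796](info): TGS_REQ (4 etypes {18 17 16 23}) 192.168.32.15: LOOKING_UP_SERVER: authtime 0, hive-mycluster/master-03.com@REALM.COM for zookeeper/master-02.com@REALM.COM, Server not found in Kerberos database'
printf '%s\n' "$line" | extract_tgs_failures | while read -r client server; do
  if follows_convention "$server" mycluster; then
    echo "ok: $client -> $server"
  else
    echo "mismatch: $client asked for $server"
  fi
done
```

On a real KDC the same functions could be fed the whole log file, piping the result through `sort -u` to collapse repeated requests.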
Labels: Apache Hive, Apache Zookeeper, Kerberos