Issue:

In a Kerberized cluster, the LLAP component of HSI (Tech Preview) fails to start because of missing keytabs.

When HSI is started, its LLAP component fails with the trace below:

INFO impl.LlapRegistryService: Using LLAP registry (client) type: Service LlapRegistryService in state LlapRegistryService: STARTED
INFO state.ConnectionStateManager: State change: CONNECTED
ERROR impl.LlapZookeeperRegistryImpl: Unable to start curator PathChildrenCache. Exception: {}
org.apache.zookeeper.KeeperException$InvalidACLException: KeeperErrorCode = InvalidACL for /llap-sasl/user-hive
at org.apache.zookeeper.KeeperException.create(KeeperException.java:121) ~[zookeeper-3.4.6.2.5.3.0-37.jar:3.4.6-37--1]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.6.2.5.3.0-37.jar:3.4.6-37--1]
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[zookeeper-3.4.6.2.5.3.0-37.jar:3.4.6-37--1]
at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:232) ~[curator-client-2.7.1.jar:?]
at org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148) ~[curator-client-2.7.1.jar:?]
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[curator-client-2.7.1.jar:?]
at org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141) ~[curator-client-2.7.1.jar:?]
at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99) ~[curator-client-2.7.1.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:323) ~[curator-recipes-2.7.1.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300) ~[curator-recipes-2.7.1.jar:?]
at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.checkPathChildrenCache(LlapZookeeperRegistryImpl.java:757) [hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.getInstances(LlapZookeeperRegistryImpl.java:725) [hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.registry.impl.LlapRegistryService.getInstances(LlapRegistryService.java:129) [hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.populateAppStatusFromLlapRegistry(LlapStatusServiceDriver.java:490) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.run(LlapStatusServiceDriver.java:245) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.main(LlapStatusServiceDriver.java:941) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
ERROR cli.LlapStatusServiceDriver: FAILED: Failed to get instances from llap registry
org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver$LlapStatusCliException: Failed to get instances from llap registry
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.populateAppStatusFromLlapRegistry(LlapStatusServiceDriver.java:492) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.run(LlapStatusServiceDriver.java:245) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.main(LlapStatusServiceDriver.java:941) [hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
Caused by: java.io.IOException: org.apache.zookeeper.KeeperException$InvalidACLException: KeeperErrorCode = InvalidACL for /llap-sasl/user-hive
at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.checkPathChildrenCache(LlapZookeeperRegistryImpl.java:760) ~[hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.getInstances(LlapZookeeperRegistryImpl.java:725) ~[hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.registry.impl.LlapRegistryService.getInstances(LlapRegistryService.java:129) ~[hive-exec-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
at org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver.populateAppStatusFromLlapRegistry(LlapStatusServiceDriver.java:490) ~[hive-llap-server-2.1.0.2.5.3.0-37.jar:2.1.0.2.5.3.0-37]
... 2 more

This can happen when HSI is enabled after the cluster has been Kerberized.
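To spot this failure in a longer log, the error line can be searched for directly. The following is a minimal Python sketch, not part of the original article; the function name `find_invalid_acl_paths` is hypothetical, and the pattern is based only on the error text shown in the trace above.

```python
import re

# Matches the ZooKeeper error text seen in the trace and captures the znode path.
INVALID_ACL = re.compile(r"KeeperErrorCode = InvalidACL for (\S+)")


def find_invalid_acl_paths(log_text):
    """Return the distinct znode paths reported with InvalidACL, in first-seen order."""
    seen = []
    for match in INVALID_ACL.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen
```

Passing the text of the log file to this function returns the affected znode paths, or an empty list when the error is not present.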

Reason:

- HSI needs two keytab files, 'hive.service.keytab' and 'hive.llap.zk.sm.keytab', to be present on all YARN NodeManager nodes.

- If HSI is enabled only after the cluster is Kerberized, these two keytab files are not distributed to the NodeManager nodes, unlike when HSI is enabled before Kerberization.
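Whether the two keytab files are present can be checked on each NodeManager node. The sketch below is only an illustration: the helper name `missing_keytabs` is hypothetical, and the default directory `/etc/security/keytabs` is an assumption that may not match your cluster.

```python
import os

# The two keytab files named in this article.
REQUIRED_KEYTABS = ("hive.service.keytab", "hive.llap.zk.sm.keytab")


def missing_keytabs(keytab_dir="/etc/security/keytabs"):
    """Return the required keytab file names that are not present in keytab_dir.

    The default directory is an assumption; pass the directory your cluster uses.
    """
    return [
        name
        for name in REQUIRED_KEYTABS
        if not os.path.isfile(os.path.join(keytab_dir, name))
    ]
```

Running `missing_keytabs()` on a NodeManager node returns an empty list when both files exist in the given directory.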

This leads to the error:

Caused by: org.apache.zookeeper.KeeperException$InvalidACLException: KeeperErrorCode = InvalidACL for /llap-sasl/user-hive
at org.apache.zookeeper.KeeperException.create(KeeperException.java:121) ~[zookeeper-3.4.6.2.5.0.0-1245.jar:3.4.6-1245--1]

The error occurs because the ZooKeeper node has not been created. Listing the path in the ZooKeeper CLI shows that the node is missing:

[zk: localhost:2181(CONNECTED) 3] ls /llap-sasl
[]

Resolution:

- Regenerate the keytabs from the Ambari Kerberos page. This distributes the above keytab files to all NodeManager nodes.

14678-screen-shot-2017-04-17-at-52613-pm.png

- Also confirm that the Hive config hive.llap.zk.sm.connectionString is set to the list of all ZooKeeper nodes in the cluster. For example:

zk.host1.org:2181,zk.host2.org:2181,zk.host3.org:2181

The list of ZooKeeper nodes can be found here:

14679-screen-shot-2017-04-17-at-124014-pm.png

Remember to append the port number to each host, as shown in the example.
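The value for hive.llap.zk.sm.connectionString can be assembled from a list of host names. This is a minimal Python sketch under stated assumptions: the function name `build_zk_connection_string` is hypothetical, and the default port 2181 is taken from the example above and should be changed if your ZooKeeper ensemble listens elsewhere.

```python
def build_zk_connection_string(hosts, default_port=2181):
    """Join ZooKeeper hosts into a comma-separated host:port list.

    Hosts that already carry a port are kept as given; default_port is
    appended to the others. Blank entries are skipped.
    """
    entries = []
    for host in hosts:
        host = host.strip()
        if not host:
            continue
        if ":" not in host:
            host = f"{host}:{default_port}"
        entries.append(host)
    return ",".join(entries)
```

For the three hosts in the example, `build_zk_connection_string(["zk.host1.org", "zk.host2.org", "zk.host3.org"])` produces the same string as shown above.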

Restart HSI and confirm that LLAP now starts.
