Environment: HDP 2.4.3, Ambari 2.4.0

SYMPTOMS: The RegionServer logs show the following:

2016-10-03 15:13:55,611 INFO  [main] regionserver.HRegionServer: STOPPED: Unexpected exception during initialization, aborting
2016-10-03 15:13:55,649 ERROR [main] token.AuthenticationTokenSecretManager: Zookeeper initialization failed
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase-secure/tokenauth/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:575)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:554)

ZooKeeper logs:

2016-10-04 15:48:45,702 - ERROR [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:SaslServerCallbackHandler@137] - Failed to set name based on Kerberos authentication rules.
org.apache.zookeeper.server.auth.KerberosName$NoMatchingRule: No rules applied to hbase/345.example.net@EXAMPLE.NET  
at org.apache.zookeeper.server.auth.KerberosName.getShortName(KerberosName.java:402)  
at org.apache.zookeeper.server.auth.SaslServerCallbackHandler.handleAuthorizeCallback(SaslServerCallbackHandler.java:127)  
at org.apache.zookeeper.server.auth.SaslServerCallbackHandler.handle(SaslServerCallbackHandler.java:83)  
at com.sun.security.sasl.gsskerb.GssKrb5Server.doHandshake2(GssKrb5Server.java:317)

ACL entries on the ZooKeeper servers:

[zk: 123.example.net:2181(CONNECTED) 0] getAcl /hbase-secure
'world,'anyone
: r
'sasl,'hbase/345.example.net@EXAMPLE.NET
: cdrwa 
'sasl,'hbase/345.example.net@EXAMPLE.NET
: cdrwa 

ROOT CAUSE: ACLs should not be defined with a hostname as part of the principal, because this can cause failures when another node takes over the master role or during a rolling restart of services. In this case the ACLs were set that way because of an Ambari bug (AMBARI-18528), which mangled the translation rules in zookeeper.security.auth_to_local in zookeeper-env.sh. See that bug for the workaround (adding a backslash in front of the dollar sign in the affected rule) and other details.
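To check other znodes for the same problem, the getAcl output can be scanned for SASL ids that still carry a host or realm. The following is only a rough sketch: it assumes the output has the two-line shape shown in the listing above (an id line followed by a permissions line), and the helper names parse_getacl and host_qualified are made up for this illustration.

```python
def parse_getacl(output):
    """Parse zkCli getAcl text into (scheme, id, perms) tuples.

    Assumes each entry is an id line such as 'sasl,'hbase followed by
    a permissions line such as ": cdrwa".
    """
    entries = []
    pending = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("'"):
            scheme, _, ident = line.partition(",")
            pending = (scheme.strip("'"), ident.strip("'"))
        elif line.startswith(":") and pending is not None:
            entries.append((pending[0], pending[1], line[1:].strip()))
            pending = None
    return entries


def host_qualified(entries):
    """Return SASL entries whose id still contains a host or a realm."""
    return [e for e in entries
            if e[0] == "sasl" and ("/" in e[1] or "@" in e[1])]


sample = """'world,'anyone
: r
'sasl,'hbase/345.example.net@EXAMPLE.NET
: cdrwa
"""

for scheme, ident, perms in host_qualified(parse_getacl(sample)):
    print("suspicious ACL id:", ident, "perms:", perms)
```

Any id reported here would need to be replaced with the short name once the znode has been recreated.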

But why was authentication failing despite a kinit using exactly the same principal as defined in the ZooKeeper ACL? The answer lies in these settings in zoo.cfg:

kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true

These two settings ensure that the ZooKeeper server strips the hostname and the realm from every authenticated principal and uses only the short name. The tricky part is that this stripping does not apply to the setAcl API, so an ACL set with the full principal is stored as-is and never matches the stripped short name.
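The mismatch can be illustrated with a small sketch. This is not ZooKeeper's code: the functions short_name and acl_matches are hypothetical, and the sketch assumes the short name is simply the primary component of the principal, ignoring any auth_to_local rules.

```python
def short_name(principal, remove_host=True, remove_realm=True):
    """Approximate the id the server derives from a Kerberos principal.

    Assumption: the short name is the primary component; real
    auth_to_local rule processing is not modelled here.
    """
    name, _, realm = principal.partition("@")
    primary, _, host = name.partition("/")
    result = primary
    if host and not remove_host:
        result += "/" + host
    if realm and not remove_realm:
        result += "@" + realm
    return result


def acl_matches(acl_id, authenticated_principal):
    """The stored ACL id is compared verbatim with the stripped id."""
    return acl_id == short_name(authenticated_principal)


principal = "hbase/345.example.net@EXAMPLE.NET"
print(short_name(principal))                                        # hbase
print(acl_matches("hbase/345.example.net@EXAMPLE.NET", principal))  # False
print(acl_matches("hbase", principal))                              # True
```

With both settings enabled, the client authenticates as hbase, while the ACL stored through setAcl still names the full principal, which is why the NoAuth error appears even after a successful kinit.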

SOLUTION: Note that the regular "rmr" command for deleting the HBase znode fails with "Authentication is not valid" errors, so an alternative is needed. One such method is described at this link. Another is to set the Java system property zookeeper.skipACL=yes in the zookeeper-env.sh file (ZooKeeper checks this property for the value yes).

If this does not work, the existing znode has to be deleted forcefully, for example by stopping HBase and deleting the entire ZooKeeper data directory. Take this step with utmost caution, and only if no other service depends on ZooKeeper.

Once the HBase znodes have been deleted, use the workaround given in AMBARI-18528 to populate the correct ACL entries, and finally start HBase.
