Zookeeper problem after hadoop kerberization
Labels: Apache Ambari, Apache Zookeeper, Kerberos
Created on 10-09-2017 06:12 AM - last edited on 03-26-2020 02:51 AM by VidyaSargur
Hello,
After Kerberizing Hadoop, some services no longer start: the YARN ResourceManager, the HBase RegionServers, Ambari Infra, and Log Search. The problem seems to be the same in every case: a "NoAuth" error for the related ZooKeeper directories (znodes). The Ambari Infra error is:
KeeperErrorCode = NoAuth for /infra-solr
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /infra-solr
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1399)
    at org.apache.ambari.logsearch.solr.util.AclUtils.setRecursivelyOn(AclUtils.java:77)
    at org.apache.ambari.logsearch.solr.commands.SecureSolrZNodeZkCommand.executeZkCommand(SecureSolrZNodeZkCommand.java:63)
    at org.apache.ambari.logsearch.solr.commands.SecureSolrZNodeZkCommand.executeZkCommand(SecureSolrZNodeZkCommand.java:39)
    at org.apache.ambari.logsearch.solr.commands.AbstractZookeeperRetryCommand.createAndProcessRequest(AbstractZookeeperRetryCommand.java:38)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.retry(AbstractRetryCommand.java:45)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.run(AbstractRetryCommand.java:40)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudClient.secureSolrZnode(AmbariSolrCloudClient.java:170)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudCLI.main(AmbariSolrCloudCLI.java:526)
Created 10-11-2017 05:31 AM
Hello,
I tried a few things: I regenerated the keytabs, checked that tickets could be obtained, and checked read access on the keytabs. There were no problems. I also tried deleting the problematic service, removing its ZooKeeper folder, and adding the service again, but the problem continued. (Removing the folder first failed with a 'no authentication' error; I was able to remove it by using the super digest as explained here: https://community.hortonworks.com/articles/29900/zookeeper-using-superdigest-to-gain-full-access-to....)
I resolved the issue by adding zookeeper.security.auth_to_local rules to the ZooKeeper environment. I added rules for the problematic services to SERVER_JVMFLAGS in the zookeeper-env template, as shown here, and then restarted ZooKeeper and the other related services.
-Dzookeeper.security.auth_to_local=RULE:[2:\$1@\$0](hbase@MY_REALM)s/.*/hbase/RULE:[2:\$1@\$0](infra-solr@MY_REALM)s/.*/infra-solr/RULE:[2:\$1@\$0](rm@MY_REALM)s/.*/rm/
Created 10-09-2017 06:58 AM
From the error, it appears that the service user has not authenticated with the proper keytab.
These are the possible root causes and fixes:
1. Check that all the service keytabs are present in "/etc/security/keytabs" on each host.
2. Verify that the service user has at least read access to its keytab.
3. The most common issue is naming: the keytab name or service principal name given in the service configuration does not match the keytab file.
Apart from this, check that you are able to get a ticket using the keytabs.
Hope this helps!
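Checks 1 and 2 above can be scripted. The following is only a minimal sketch in Python, not an Ambari tool: the function names are made up for illustration, and `hbase.service.keytab` is an assumed file name (only `yarn.service.keytab` appears in this thread). Run it as the service user, because `os.access` reports readability for the user executing the script, and root can read everything.

```python
import os
from pathlib import Path

# Directory named in the reply above; adjust if your cluster differs.
KEYTAB_DIR = "/etc/security/keytabs"


def check_keytab(path):
    """Return a list of problems for one keytab file (empty list means OK)."""
    p = Path(path)
    if not p.is_file():
        return [f"{p}: keytab file is missing"]
    if not os.access(p, os.R_OK):
        return [f"{p}: not readable by the user running this check"]
    return []


def check_keytabs(names, keytab_dir=KEYTAB_DIR):
    """Check several keytab file names inside keytab_dir."""
    problems = []
    for name in names:
        problems.extend(check_keytab(Path(keytab_dir) / name))
    return problems


if __name__ == "__main__":
    # Hypothetical file names; use the ones from your service configuration.
    for line in check_keytabs(["yarn.service.keytab", "hbase.service.keytab"]):
        print(line)
```

An empty result only means the files exist and are readable; it says nothing about whether the principal inside the keytab matches the service configuration (point 3).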
Created 10-09-2017 09:21 AM
There are a couple of things that could be wrong. As a first step, re-run the Kerberos wizard in the Ambari UI and ensure it regenerates the principals and keytabs without any error.
On the node where the services are running, check that the keytabs were generated in /etc/security/keytabs/*
On the KDC server, validate that the principals were created:
# kadmin.local
kadmin.local: listprincs
All the principals in question should be in the KDC database.
Check that the keytabs are mapped to the correct principal:
# klist -kt /etc/security/keytabs/yarn.service.keytab
Keytab name: FILE:/etc/security/keytabs/yarn.service.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 08/24/2017 15:42:24 yarn/{host_FQDN}@REALM
   1 08/24/2017 15:42:24 yarn/{host_FQDN}@REALM
   1 08/24/2017 15:42:24 yarn/{host_FQDN}@REALM
   1 08/24/2017 15:42:24 yarn/{host_FQDN}@REALM
   1 08/24/2017 15:42:24 yarn/{host_FQDN}@REALM
Using the correct principal, obtain a Kerberos ticket:
# kinit -kt /etc/security/keytabs/yarn.service.keytab yarn/{host_FQDN}@REALM
Check that a valid ticket was issued:
# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: yarn/{host_FQDN}@REALM
Valid starting       Expires              Service principal
10/09/2017 11:13:07  10/10/2017 11:13:07  krbtgt/REALM@REALM
In Ambari, start that particular service (in the above case, YARN). Please report back.
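To compare a keytab's contents with the principal configured for a service, the `klist -kt` output can be parsed. The sketch below is illustrative only: it assumes the row layout shown above (KVNO, date, time, principal), which can differ with locale or Kerberos version, and `host1.example.com` is a placeholder for the real host FQDN.

```python
import re

# One data row of `klist -kt` output: KVNO, date, time, principal.
# Assumes the layout shown in this thread; other locales may differ.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def principals_in_keytab(klist_output):
    """Return the set of principals listed in `klist -kt` output text."""
    found = set()
    for line in klist_output.splitlines():
        m = ROW.match(line)
        if m:
            found.add(m.group(4))
    return found


def keytab_matches(klist_output, configured_principal):
    """True if the configured principal appears in the keytab listing."""
    return configured_principal in principals_in_keytab(klist_output)


# Sample text modelled on the listing above, with a placeholder hostname.
SAMPLE = """\
Keytab name: FILE:/etc/security/keytabs/yarn.service.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 08/24/2017 15:42:24 yarn/host1.example.com@REALM
   1 08/24/2017 15:42:24 yarn/host1.example.com@REALM
"""

if __name__ == "__main__":
    print(sorted(principals_in_keytab(SAMPLE)))
    print(keytab_matches(SAMPLE, "yarn/host1.example.com@REALM"))
```

On a real host the text would come from running `klist -kt <keytab>` (for example through `subprocess.run`) instead of the sample string.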
Created 03-26-2020 02:08 AM
Hi,
I have tried the above solution, but it did not work.
Any idea what the issue could be?
Thanks in advance.
Created 12-01-2020 12:55 PM
The following map rule is wrong:
RULE:[2:\$1@\$0](rm@MY_REALM)s/.*/rm/
The user for the ResourceManager is not "rm" but "yarn", so "yarn" should be the replacement value. This works the same way as hadoop.security.auth_to_local in the Hadoop/HDFS configuration.
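To see why the replacement value matters, here is a rough Python model of how such a rule turns a principal into a short name. It is a simplified sketch that assumes the rule semantics match Hadoop's auth_to_local, not the actual ZooKeeper or Hadoop code. The function names and the hostname are made up, and the rules are written without the backslash escaping that the zookeeper-env template needs.

```python
import re

# RULE:[n:format](match-regex)s/pattern/replacement/  with an optional trailing g
RULE = re.compile(r"RULE:\[(\d+):([^\]]*)\]\(([^)]*)\)s/([^/]*)/([^/]*)/(g?)")


def parse_rules(spec):
    """Split a concatenated rule string into tuples of rule parts."""
    return [m.groups() for m in RULE.finditer(spec)]


def short_name(principal, spec):
    """Return the short name from the first matching rule, or None."""
    name, _, realm = principal.partition("@")
    components = name.split("/")
    for n, fmt, match, pat, repl, g in parse_rules(spec):
        # [n:...] only applies to principals with exactly n components.
        if int(n) != len(components):
            continue
        # Expand $0 (realm) and $1..$n (components) in the format string.
        parts = [realm] + components
        base = re.sub(r"\$(\d)", lambda m: parts[int(m.group(1))], fmt)
        # The rule applies only if the regex matches the whole string.
        if re.fullmatch(match, base):
            return re.sub(pat, repl, base, count=0 if g else 1)
    return None


POSTED = "RULE:[2:$1@$0](rm@MY_REALM)s/.*/rm/"
CORRECTED = "RULE:[2:$1@$0](rm@MY_REALM)s/.*/yarn/"

if __name__ == "__main__":
    rm = "rm/host1.example.com@MY_REALM"  # hypothetical ResourceManager principal
    print(short_name(rm, POSTED))     # rm
    print(short_name(rm, CORRECTED))  # yarn
```

In this model the posted rule maps the ResourceManager principal to the short name "rm", while the corrected rule maps it to "yarn", the user the reply above says the ResourceManager runs as.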