Created 12-28-2016 02:54 AM
16/12/28 10:45:26 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.setSafeMode over null. Not retrying because try once and fail.
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for hdfs-hdpcluster@EXAMPLE.COM to bigdata013.example.com/<ip-address>:8020; Host Details : local host is: "bigdata013.example.com/<ip-address>"; destination host is: "bigdata013.example.com":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1556)
    at org.apache.hadoop.ipc.Client.call(Client.java:1496)
    at org.apache.hadoop.ipc.Client.call(Client.java:1396)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy10.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:711)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy11.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2657)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1340)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1324)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.setSafeMode(DFSAdmin.java:611)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1916)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
Caused by: java.io.IOException: Couldn't setup connection for hdfs-hdpcluster@EXAMPLE.COM to bigdata013.example.com/<ip-address>:8020
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:712)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
    at org.apache.hadoop.ipc.Client.call(Client.java:1449)
    ... 20 more
Caused by: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
    ... 23 more
safemode: Failed on local exception: java.io.IOException: Couldn't setup connection for hdfs-hdpcluster@EXAMPLE.COM to bigdata013.example.com/<ip-address>:8020; Host Details : local host is: "bigdata013.example.com/<ip-address>"; destination host is: "bigdata013.example.com":8020;
16/12/28 10:45:40 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
16/12/28 10:45:43 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
16/12/28 10:45:44 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
16/12/28 10:45:48 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
Created 11-02-2017 12:29 PM
Is this thread still open, i.e. has this problem not been resolved yet?
Please reply.
Created 12-28-2016 08:51 AM
Since the underlying error is "GSS initiate failed", can you please check whether you have a valid ticket and whether you can run "kinit" manually?
Also, are you using the Sun JDK? If so, you will have to install the JCE policy files for encryption.
Please check the link below, which says: "Before enabling Kerberos in the cluster, you must deploy the Java Cryptography Extension (JCE) security policy files on the Ambari Server and on all hosts in the cluster."
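To show how "GSS initiate failed" can be picked out of a nested trace like the one in the first post, here is a minimal Python sketch. The function name cause_chain and the shortened sample text are my own illustration, not part of any Hadoop tool; it only assumes the usual Java layout of "Caused by:" headers followed by "at ..." frames.

```python
import re

def cause_chain(trace):
    """Return the exception headers of a Java stack trace, outermost first."""
    chain = []
    # Each "Caused by: " starts a new nested exception.
    for part in trace.split("Caused by: "):
        # The header ends where the first stack frame ("at pkg.Class.method(") begins.
        header = re.split(r"\s+at\s+[\w.$]+\(", part, maxsplit=1)[0]
        chain.append(header.strip())
    return chain

# Shortened, illustrative version of the trace from the first post.
sample = (
    "java.io.IOException: Failed on local exception; "
    "at org.apache.hadoop.ipc.Client.call(Client.java:1496) "
    "Caused by: java.io.IOException: Couldn't setup connection "
    "at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:712) "
    "Caused by: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): "
    "GSS initiate failed "
    "at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)"
)

# The last entry is the innermost (root) cause.
print(cause_chain(sample)[-1])
```

The innermost entry is the one to troubleshoot first; the outer IOExceptions only wrap it.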
Created 12-28-2016 08:58 AM
Yes, I have installed JCE manually. I also ran "kinit" to test the ticket, and the result is OK.
One question: is it OK for the KDC and ambari-server to be on the same host?
Created 12-28-2016 09:02 AM
Yes, that is fine. The KDC and Ambari can be co-located on the same host, or they can run on separate hosts.
I have a setup where I run the KDC and Ambari on the same host, without any issues so far.
Created 12-28-2016 09:04 AM
How long have you been facing this issue? Was there a recent change or upgrade?
What are your HDP and Ambari versions?
Is only the NameNode failing with "GSS initiate failed", or are other components failing with the same issue as well?
I am sure the hostname is correct (example: hostname -f), but it is still worth checking.
- Is this issue happening only on the host "bigdata013.example.com/<ip-address>:8020"? Is this the only host (and the components on it) reporting "GSS initiate failed", or are other hosts in your cluster affected too?
- It is worth checking the hostname and KDC connectivity.
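As a rough way to check the hostname side, the sketch below does a forward lookup and then a reverse lookup and reports whether the two names agree. It uses only Python's standard socket module; the function name lookup_report and the shape of the returned dictionary are my own choices, and a mismatch here is only a hint, not proof, that Kerberos is failing for this reason.

```python
import socket

def lookup_report(fqdn=None):
    """Resolve a host name forward and backward and report whether the names agree."""
    # Default to this machine's fully qualified name, similar to "hostname -f".
    name = fqdn or socket.getfqdn()
    report = {"fqdn": name, "ip": None, "reverse": None, "consistent": False}
    try:
        report["ip"] = socket.gethostbyname(name)                   # forward lookup
        report["reverse"] = socket.gethostbyaddr(report["ip"])[0]   # reverse lookup
    except OSError as err:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        report["error"] = str(err)
        return report
    # Kerberos service principals embed the host name, so both directions should match.
    report["consistent"] = report["reverse"].lower() == name.lower()
    return report
```

Calling lookup_report() on each cluster host (or lookup_report("bigdata013.example.com") from another host) gives a quick view of whether forward and reverse DNS line up.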
Created 12-28-2016 09:17 AM
No, I installed HDP and Ambari just a short while ago. After the installation I ran "Enable Kerberos", and that is when I hit this issue.
HDP version: HDP-2.5.0.0
Ambari version: 2.4.1.0
All services encountered this issue.
I saw your reply to my other question. After I installed JCE, I encountered 'App Timeline Server start failed'.
The log is:
Created 12-28-2016 09:20 AM
- Please check whether the reverse lookup is correct on that host.
- It would also help if you could share the output of the following commands, to see the Kerberos-related Java options in use:
ps -ef | grep AmbariServer
ps -ef | grep NameNode
- This is just to verify that the processes use the correct Java path, the one that has the JCE installed.
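If the ps output is long, a small Python sketch like the one below can pull out the Java binary and the Kerberos-related options from one line of it. The function name kerberos_java_options and the sample line (including the JDK path and PID) are hypothetical; the prefixes cover standard JVM system properties such as java.security.krb5.conf and sun.security.krb5.debug.

```python
# Prefixes of standard JVM system properties related to Kerberos/JAAS.
KERBEROS_PREFIXES = (
    "-Djava.security.",
    "-Dsun.security.krb5.",
    "-Djavax.security.auth.",
)

def kerberos_java_options(ps_line):
    """Return (java_binary, kerberos_options) found in one line of ps -ef output."""
    tokens = ps_line.split()
    # The JVM binary is the first token that is "java" or a path ending in "/java".
    java_bin = next((t for t in tokens if t == "java" or t.endswith("/java")), None)
    options = [t for t in tokens if t.startswith(KERBEROS_PREFIXES)]
    return java_bin, options

# Hypothetical ps -ef line for a NameNode process; paths and PID are made up.
sample_line = (
    "hdfs 4321 1 0 10:40 ? 00:00:12 /usr/jdk64/jdk1.8.0_77/bin/java "
    "-Dproc_namenode -Djava.security.krb5.conf=/etc/krb5.conf "
    "-Dsun.security.krb5.debug=true "
    "org.apache.hadoop.hdfs.server.namenode.NameNode"
)

java_bin, options = kerberos_java_options(sample_line)
print(java_bin)
print(options)
```

The reported Java binary is the JDK whose JCE policy files matter for that process.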
Created 12-28-2016 10:09 AM
The NameNode is in safe mode and cannot come up.
Created 12-28-2016 09:24 AM
Also, can you please share the output of the following command, to see the encryption types used by the Kerberos tickets?
klist -e -k /etc/security/keytabs/hdfs.headless.keytab
# klist -e -k /etc/security/keytabs/hdfs.headless.keytab
Keytab name: FILE:/etc/security/keytabs/hdfs.headless.keytab
KVNO Principal
---- --------------------------------------------------------------------------
4 hdfs-JoyCluster@EXAMPLE.COM (des3-cbc-sha1)
4 hdfs-JoyCluster@EXAMPLE.COM (arcfour-hmac)
4 hdfs-JoyCluster@EXAMPLE.COM (aes128-cts-hmac-sha1-96)
4 hdfs-JoyCluster@EXAMPLE.COM (des-cbc-md5)
4 hdfs-JoyCluster@EXAMPLE.COM (aes256-cts-hmac-sha1-96)
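To compare keytabs across hosts, the klist -e -k output can be summarised with a short Python sketch like this one. The function name keytab_enctypes and the regular expression are my own and assume the "KVNO Principal (enctype)" layout shown above; other klist versions may print extra columns.

```python
import re
from collections import defaultdict

# One keytab entry: KVNO, principal, and the encryption type in parentheses.
ENTRY = re.compile(r"(\d+)\s+(\S+@\S+)\s+\(([^)]+)\)")

def keytab_enctypes(klist_output):
    """Map (principal, kvno) to the list of encryption types found in klist -e -k output."""
    found = defaultdict(list)
    for kvno, principal, enctype in ENTRY.findall(klist_output):
        found[(principal, int(kvno))].append(enctype)
    return dict(found)

sample_output = """Keytab name: FILE:/etc/security/keytabs/hdfs.headless.keytab
KVNO Principal
---- --------------------------------------------------------------------------
1 hdfs-hdpcluster@EXAMPLE.COM (des3-cbc-sha1)
1 hdfs-hdpcluster@EXAMPLE.COM (aes256-cts-hmac-sha1-96)
1 hdfs-hdpcluster@EXAMPLE.COM (arcfour-hmac)
1 hdfs-hdpcluster@EXAMPLE.COM (aes128-cts-hmac-sha1-96)
1 hdfs-hdpcluster@EXAMPLE.COM (des-cbc-md5)
"""

entries = keytab_enctypes(sample_output)
# AES-256 keys are the ones that depend on the unlimited-strength JCE policy on older JDKs.
uses_aes256 = any(e.startswith("aes256") for types in entries.values() for e in types)
print(entries)
print(uses_aes256)
```

As far as I know, AES-256 entries are the ones that need the unlimited-strength JCE policy files on Oracle JDK 8 builds older than 8u161, so a keytab that lists aes256-cts-hmac-sha1-96 is a reason to double-check the JCE installation on the JDK the services actually run with.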
Created 12-28-2016 10:10 AM
Keytab name: FILE:/etc/security/keytabs/hdfs.headless.keytab
KVNO Principal
---- --------------------------------------------------------------------------
1 hdfs-hdpcluster@EXAMPLE.COM (des3-cbc-sha1)
1 hdfs-hdpcluster@EXAMPLE.COM (aes256-cts-hmac-sha1-96)
1 hdfs-hdpcluster@EXAMPLE.COM (arcfour-hmac)
1 hdfs-hdpcluster@EXAMPLE.COM (aes128-cts-hmac-sha1-96)
1 hdfs-hdpcluster@EXAMPLE.COM (des-cbc-md5)