Hi!
I'm using CDH 5.4.0 with a Kerberos-secured cluster.
When I run "Refresh Cluster" from Cloudera Manager, I get this message:
Failed to update refreshable configuration files in the cluster
This is the stderr:
+ chown -R : /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = refresh-datanode ']'
+ '[' file-operation = refresh-datanode ']'
+ '[' bootstrap = refresh-datanode ']'
+ '[' failover = refresh-datanode ']'
+ '[' transition-to-active = refresh-datanode ']'
+ '[' initializeSharedEdits = refresh-datanode ']'
+ '[' initialize-znode = refresh-datanode ']'
+ '[' format-namenode = refresh-datanode ']'
+ '[' monitor-decommission = refresh-datanode ']'
+ '[' jnSyncWait = refresh-datanode ']'
+ '[' nnRpcWait = refresh-datanode ']'
+ '[' -safemode = '' -a get = '' ']'
+ '[' monitor-upgrade = refresh-datanode ']'
+ '[' finalize-upgrade = refresh-datanode ']'
+ '[' rolling-upgrade-prepare = refresh-datanode ']'
+ '[' rolling-upgrade-finalize = refresh-datanode ']'
+ '[' nnDnLiveWait = refresh-datanode ']'
+ '[' refresh-datanode = refresh-datanode ']'
+ '[' 3 -lt 3 ']'
+ DN_ADDR=bda1node02.company.com:50020
+ /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop-hdfs/bin/hdfs --config /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh dfsadmin -reconfig datanode bda1node02.company.com:50020 start
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
reconfig: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "bda1node02.company.com/192.168.8.2"; destination host is: "bda1node02.company.com":50020;
+ RET=255
+ '[' 255 -ne 0 ']'
+ echo 'Unable to start reconfigure task on DataNode bda1node02.company.com:50020.'
+ exit 255
Any thoughts?
Thank you
Ben
Created 11-11-2015 12:15 PM
Any luck on this one? We just ran into it as well.
Created 11-12-2015 09:47 PM
Nope.
Created 11-19-2015 11:22 AM
Spoke with a Cloudera rep. They suggested upgrading to CM 5.4.5 or later. In the meantime, the workaround was to manually kinit with the HDFS datanode keytab and kick off the refresh from the command line.
It will be a while before we get to try the upgrade, but I'll post back here after we complete it.
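For anyone who wants to try the manual workaround, here is a rough sketch of what it could look like. The keytab path and the Kerberos realm are placeholders I made up, not values from this thread; the DataNode address is the one from the stderr in the original post. The `run` and `refresh_datanode` helpers are just illustrative names. `klist -kt <keytab>` should list the principals a keytab actually contains if you are unsure which one to use.

```shell
#!/usr/bin/env bash
# Sketch of the manual workaround: obtain a TGT from the DataNode keytab,
# then start the reconfigure task by hand. Not an official procedure.
set -u

# Assumptions: adjust all three for your cluster.
KEYTAB="${KEYTAB:-/path/to/hdfs.keytab}"                          # hypothetical path
PRINCIPAL="${PRINCIPAL:-hdfs/bda1node02.company.com@EXAMPLE.COM}" # realm is made up
DN_ADDR="${DN_ADDR:-bda1node02.company.com:50020}"                # from the stderr above

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

refresh_datanode() {
  # 1. Get a ticket from the keytab (no password prompt).
  run kinit -kt "$KEYTAB" "$PRINCIPAL" &&
  # 2. Same dfsadmin call the Cloudera Manager script attempted.
  run hdfs dfsadmin -reconfig datanode "$DN_ADDR" start &&
  # 3. Ask the DataNode how the reconfigure task is going.
  run hdfs dfsadmin -reconfig datanode "$DN_ADDR" status
}

refresh_datanode
```

As written it only prints the three commands so you can review them first. Run it with `DRY_RUN=0` as a user that can read the keytab to actually execute them.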
Created 05-06-2016 04:30 PM
On Cloudera Manager 5.7 I was seeing the same problem, but luckily I fixed it by adding this to "YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml":
<property>
<name>security.resourcemanager-administration.protocol.acl</name>
<value>*</value>
</property>
If you found this helpful, please buy me beer 😉
Created on 05-06-2016 04:34 PM - edited 05-06-2016 04:35 PM
Of course, if you do this, anyone can change resource pool settings using the Cloudera Manager REST API or the yarn admin command. When you get the update error, you can check in the Cloudera Server log which specific user performs the command, but I didn't bother to check it.