Support Questions

Refresh Cluster fails on Secured Cluster

Rising Star

Hi!

I'm using CDH 5.4.0 with a Kerberos-secured cluster.

 

When I run "Refresh Cluster" from Cloudera Manager, I get this message:

Failed to update refreshable configuration files in the cluster 

 

This is the stderr:

 

+ chown -R : /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = refresh-datanode ']'
+ '[' file-operation = refresh-datanode ']'
+ '[' bootstrap = refresh-datanode ']'
+ '[' failover = refresh-datanode ']'
+ '[' transition-to-active = refresh-datanode ']'
+ '[' initializeSharedEdits = refresh-datanode ']'
+ '[' initialize-znode = refresh-datanode ']'
+ '[' format-namenode = refresh-datanode ']'
+ '[' monitor-decommission = refresh-datanode ']'
+ '[' jnSyncWait = refresh-datanode ']'
+ '[' nnRpcWait = refresh-datanode ']'
+ '[' -safemode = '' -a get = '' ']'
+ '[' monitor-upgrade = refresh-datanode ']'
+ '[' finalize-upgrade = refresh-datanode ']'
+ '[' rolling-upgrade-prepare = refresh-datanode ']'
+ '[' rolling-upgrade-finalize = refresh-datanode ']'
+ '[' nnDnLiveWait = refresh-datanode ']'
+ '[' refresh-datanode = refresh-datanode ']'
+ '[' 3 -lt 3 ']'
+ DN_ADDR=bda1node02.company.com:50020
+ /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop-hdfs/bin/hdfs --config /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh dfsadmin -reconfig datanode bda1node02.company.com:50020 start
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
reconfig: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "bda1node02.company.com/192.168.8.2"; destination host is: "bda1node02.company.com":50020;
+ RET=255
+ '[' 255 -ne 0 ']'
+ echo 'Unable to start reconfigure task on DataNode bda1node02.company.com:50020.'
+ exit 255

Any thoughts?

Thank you

Ben


5 REPLIES

New Contributor

Any luck on this one? We just ran into it as well. 

Rising Star

Nope.

New Contributor

I spoke with a Cloudera rep. They suggested upgrading to CM 5.4.5 or later. In the meantime, the workaround was to manually kinit with the HDFS DataNode keytab and kick off the refresh from the command line.
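For reference, a rough sketch of that manual workaround, using the keytab, host, and port from the stderr above. The process directory number and the EXAMPLE.COM realm are placeholders; those process directories are transient, so use whichever hdfs-DATANODE directory is current on that host:

# Change into the process directory that holds the DataNode keytab (the number will differ per run)
cd /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh
# List the principals stored in the keytab to confirm the exact principal name
klist -kt hdfs.keytab
# Obtain a Kerberos TGT as the DataNode's hdfs principal (realm is a placeholder)
kinit -kt hdfs.keytab hdfs/bda1node02.company.com@EXAMPLE.COM
# Kick off the DataNode reconfiguration, as the refresh script tried to do
hdfs dfsadmin -reconfig datanode bda1node02.company.com:50020 start
# Poll for completion
hdfs dfsadmin -reconfig datanode bda1node02.company.com:50020 status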

 

It will be a while before we get to try the upgrade, but I'll post back here after we complete it. 

Rising Star (Accepted Solution)

On Cloudera Manager 5.7 I was seeing the same problem, but luckily I fixed it by adding this to the "YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml":

<property>
  <name>security.resourcemanager-administration.protocol.acl</name>
  <value>*</value>
</property>

If you found this helpful, please buy me a beer 😉

Rising Star

Of course, if you do this, anyone can change resource pool settings using the Cloudera Manager REST API or the yarn admin command. When you get the update error you can check which user performed the command in the Cloudera Manager server log, but I didn't bother to check it.
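If the wide-open "*" is a concern, hadoop-policy.xml ACLs also accept a comma-separated list of users, followed by a space and a comma-separated list of groups. A sketch of a tighter value, where the user and group names are just placeholders for whatever your environment uses:

<property>
  <name>security.resourcemanager-administration.protocol.acl</name>
  <!-- placeholder principals: allow the yarn user and members of hadoop-admins only -->
  <value>yarn hadoop-admins</value>
</property>

After changing this you would typically restart the ResourceManager (or run "yarn rmadmin -refreshServiceAcls") for the new ACL to take effect.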