
CDH cluster - command timed-out after 90 seconds

New Contributor

Hi,

 

We use Cloudera CDH 5.4.0. When we deployed a Sahara cluster, we saw the following error: "Command aborted because of exception: Command timed-out after 90 seconds. Program: hdfs/hdfs.sh ["format-namenode","cluster2"]".
After that, the Sahara Ubuntu CDH cluster deployment failed.
 
The question is: how can we adjust the 90-second timeout?
 
Thank you
 
From sahara-engine.log
2016-03-10 12:03:16.869 5178 DEBUG sahara.plugins.cdh.client.resource [-] GET got response: {
"id" : 48,
"name" : "First invoke /usr/lib/python2.7/dist-packages/sahara/plugins/cdh/client/resource.py:88
2016-03-10 12:03:17.188 5178 ERROR sahara.service.ops [-] Error during operating on cluster admin-cdh-01 (reason: Failed to Provision Hadoop Cluster: Command aborted because of exception: Command timed-out after 90 seconds
Error ID: 156d9201-afc9-4ac4-a359-eddc146dcd0f)
2016-03-10 12:03:18.169 5178 INFO sahara.utils.general [-] Cluster status has been changed: id=dd7ce8da-4aa4-4bf0-bce2-f2a7400cf123, New status=Error
 

Re: CDH cluster - command timed-out after 90 seconds

Rising Star

Hi,

 

Unfortunately, the timeout is hard-coded on the CM server side.

I'm wondering why the command takes longer than the 90-second limit. What do you see in the CM agent log on the target host?

 

 

Re: CDH cluster - command timed-out after 90 seconds

New Contributor

Ubuntu instances on OpenStack have an issue with 'sudo'. When we run 'sudo' on the instance, it tries to resolve the instance's own hostname. The lookup either fails outright or waits for the resolver to time out, and in the worst case the command fails with 'timeout'. Formatting HDFS probably needs the same name resolution, so the same failure happens there as well.

 

ubuntu@sahara-test1:~$ sudo -i
sudo: unable to resolve host sahara-test1
ubuntu@sahara-test1:~$ sudo apt update
sudo: unable to resolve host sahara-test1

 

The root cause is a mismatch between /etc/hostname and /etc/hosts: the hostname is set from the instance name, but /etc/hosts is not updated with that name. The mismatch occurs on both Ubuntu and CentOS, but the resolver issue appears only on Ubuntu. CentOS does not report any error for it.
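
To confirm the mismatch on a given instance, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, and it only checks whether the hostname appears as a name or alias in /etc/hosts, which is an approximation of what the resolver actually does.

```python
import socket


def hostname_in_hosts(hostname, hosts_text):
    """Return True if hostname appears as a name or alias in hosts-file text."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        # fields[0] is the IP address; the remaining fields are names/aliases.
        if wanted in (name.lower() for name in fields[1:]):
            return True
    return False


def check_local_host(hosts_path="/etc/hosts"):
    """Check the local machine: is its own hostname listed in the hosts file?"""
    with open(hosts_path) as f:
        return hostname_in_hosts(socket.gethostname(), f.read())
```

Running check_local_host() on an affected Ubuntu instance would be expected to return False, matching the "unable to resolve host" message from sudo.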

The solution is to update /etc/hosts with cloud-init.
The following cloud-config sets the hostname and the FQDN, and also updates /etc/hosts on the instance.

 

#cloud-config
hostname: sahara-test1
fqdn: sahara-test1.localdomain
manage_etc_hosts: true
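
Since every instance in a cluster has a different name, the user data above could be generated per instance. A minimal sketch, assuming the "localdomain" suffix from the example is acceptable; build_user_data is a hypothetical helper, not part of Sahara or cloud-init:

```python
def build_user_data(hostname, domain="localdomain"):
    """Render the cloud-config shown above for one instance name."""
    lines = [
        "#cloud-config",  # must be the first line of the user data
        f"hostname: {hostname}",
        f"fqdn: {hostname}.{domain}",
        "manage_etc_hosts: true",  # lets cloud-init rewrite /etc/hosts
    ]
    return "\n".join(lines) + "\n"
```

The returned string would then be passed as user data when the instance is booted.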

 

For more detail, please check the following link about cloud-init and hostnames.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/4/html/End...

Mirantis appears to run its DNS test against Google DNS (www.google.com via 8.8.8.8 or 8.8.4.4), so the issue might happen there.

The issue should also affect images built with sahara diskimage-create.

https://github.com/openstack/sahara-image-elements

Perhaps the cloud-init configuration should be added to post-install.d in the hadoop-cloudera element.

https://github.com/openstack/sahara-image-elements/tree/master/elements/hadoop-cloudera