Failed to add new Host through Cloudera Manager

New Contributor

Hello,

We're running Red Hat 6.4 on two of our nodes.
We've installed the new Cloudera Manager 5.5.0 and have been trying to create a cluster and add a first node to it (the node is initially clean of any Cloudera component). Unfortunately, the cluster installation fails every time with this message:

 

Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
Ensure that ports 9000 and 9001 are not in use on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added. (Some of the logs can be found in the installation details).
If Use TLS Encryption for Agents is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added. Restart the corresponding agent and click the Retry link here.
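
Most items in the checklist above can be checked from a shell on the host being added. The following is only a sketch: the function names port_reachable and agent_use_tls are made up for illustration and are not part of Cloudera Manager, and it assumes bash and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the checklist; not part of Cloudera Manager.

# Succeeds if a TCP connection to host:port opens within 3 seconds.
# Relies on bash's /dev/tcp redirection, so it needs bash, not plain sh.
port_reachable() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Prints the value of the last uncommented use_tls setting in an agent
# config file, or prints nothing if the setting is absent.
agent_use_tls() {
    local config=$1
    sed -n 's/^[[:space:]]*use_tls[[:space:]]*=[[:space:]]*\([0-9]*\).*/\1/p' "$config" | tail -n 1
}
```

For example, port_reachable domain.node1.fr.net 7182 run on the new node should succeed, port_reachable localhost 9000 should fail while the agent is stopped (a success means something already listens there), and agent_use_tls /etc/cloudera-scm-agent/config.ini shows the TLS setting. For the hostname item, hostname -f should print the fully qualified name.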

 

From what we read, this is usually caused by a misconfigured /etc/hosts file. So we edited ours on both the Cloudera Manager host and the new node, then ran service network restart as well as service cloudera-scm-server restart, but that didn't work either.
Here's what the /etc/hosts file looks like:

 

127.0.0.1 localhost
10.186.80.86 domain.node2.fr.net host
10.186.80.105 domain.node1.fr.net mgrnode
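
A hosts file like the one above can be sanity-checked with a short script. This is a sketch under assumptions: the rules it applies (fully qualified name listed first after the IP, no uppercase hostnames, no duplicate IP addresses) are my reading of the usual Cloudera networking requirements, and check_hosts_file is a made-up name. It prints one line per problem and nothing when the file passes.

```shell
#!/usr/bin/env bash
# Hypothetical checker; the rules below are assumptions, not an official list.
check_hosts_file() {
    local file=$1
    awk '
        # Skip blank lines and comment lines.
        NF == 0 || substr($1, 1, 1) == "#" { next }
        # Skip loopback entries.
        $1 ~ /^127\./ || $1 == "::1" { next }
        {
            if (NF < 2)          print "line " NR ": no hostname after " $1
            else if ($2 !~ /\./) print "line " NR ": first name " $2 " is not fully qualified"
            if ($2 ~ /[A-Z]/)    print "line " NR ": hostname " $2 " contains uppercase letters"
            if (seen[$1]++)      print "line " NR ": duplicate IP " $1
        }
    ' "$file"
}
```

Run it as check_hosts_file /etc/hosts on both the Cloudera Manager host and the new node. The three entries shown above pass these particular checks, so the script alone would not explain the failure.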

 

We also tried cleaning up before relaunching the cluster creation by deleting scm_prepare_node.* and .scm_prepare_node.lock.

After each failed installation we also checked service cloudera-scm-agent status on the new node, and noticed that the service isn't running (a service restart gives the same result):

 

service cloudera-scm-agent start 
Starting cloudera-scm-agent: [ OK ] 
service cloudera-scm-agent status 
cloudera-scm-agent dead but pid file exists
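
The "dead but pid file exists" status means the init script found a PID file whose process is gone, i.e. the agent started and then exited. A minimal sketch of the same check follows; pid_file_is_stale is a hypothetical helper, and the PID file path is left as a parameter because the exact location depends on the init script.

```shell
#!/usr/bin/env bash
# Hypothetical helper. Exit status 0 means the PID file exists but its
# process is gone; non-zero means no PID file or a live process.
pid_file_is_stale() {
    local pidfile=$1 pid
    [ -f "$pidfile" ] || return 1
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 0 ;;   # empty or non-numeric content counts as stale
    esac
    # kill -0 also fails for a live process owned by another user, so on
    # Linux the /proc entry is checked as well before declaring it stale.
    ! kill -0 "$pid" 2>/dev/null && [ ! -d "/proc/$pid" ]
}
```

A stale file by itself only says that the agent died after starting, so the reason still has to come from the agent log.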

 

Here are the agent logs from the new node:

 

tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Agent Logging Level: INFO 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO No command line vars 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Missing database jar: /usr/share/java/mysql-connector-java.jar (normal, if you're not using this database type) 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type) 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Agent starting as pid 24529 user cloudera-scm(420) group cloudera-scm(207). 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Because agent not running as root, all processes will run with current user. 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent WARNING Expected mode 0751 for /var/run/cloudera-scm-agent but was 0755 
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent 
[30/Nov/2015 15:07:29 +0000] 24529 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/cgroups


Is there anything we're doing wrong?
Thanks in advance for your help!

 

 

1 ACCEPTED SOLUTION

New Contributor

Found the solution, and it's almost what you said, @dice.

This time we just created the cluster with the root user (we left the single user mode option unchecked).

 

Besides, our host had no internet access, and since we had created our own repository, we needed one last step before launching the cluster creation: importing the GPG key on the host with this command:

 

sudo rpm --import <gpg_key_path>

 

If anybody finds themselves facing the same problem, hope this helps!
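
The key import from the solution above can be wrapped so that it fails early on a wrong path and shows which keys the RPM database knows afterwards. This is a sketch only: import_repo_key is a made-up name, the key path stays a placeholder as in the original command, and the function has to run as root (or through sudo) for the import to succeed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around rpm --import; must be run as root.
import_repo_key() {
    local key=$1
    if [ ! -r "$key" ]; then
        echo "cannot read key file: $key" >&2
        return 1
    fi
    rpm --import "$key" || return 1
    # List the public keys now present in the RPM database.
    rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n'
}
```

If the listing includes the key for your local repository, package signature checks during the agent installation should no longer depend on internet access.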


2 REPLIES

Expert Contributor

Hi,

 

I'm unsure if your /etc/hosts is the real one, but ensure that you meet all the requirements under "Networking and Security Requirements" in the following guide.

 

http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_ig_cm_requiremen...

 

Also, it looks like you're enabling single user mode, judging by "Because agent not running as root". Is this correct? Have you followed the guide below?

 

http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/install_singleuser_...
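
The log line quoted above comes right after an "Agent starting as pid ... user ..." entry, so the account the agent runs under can be read straight from the log. A small sketch, assuming the log format shown in the question; agent_started_as is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the user name from the most recent
# "Agent starting as pid N user NAME(UID) ..." line in an agent log.
agent_started_as() {
    local logfile=$1
    sed -n 's/.*Agent starting as pid [0-9]* user \([^( ]*\)(.*/\1/p' "$logfile" | tail -n 1
}
```

Running agent_started_as /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the new node prints the account name. Anything other than root suggests the agent was installed in single user mode, in which case the extra setup steps from the guide apply.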
