
CDH 5 fails to install on Ubuntu 12.04

New Contributor

I am trying to install CDH5 on Ubuntu 12.04, and the install fails each time with "unable to receive agent heartbeat". Firewalls are stopped and the IPs are OK. The uninstall also fails with "rm: cannot remove /var/run/cloudera-scm-agent/process: device or resource busy". Has anyone managed to install CDH5? Are there any additional system requirements?

1 ACCEPTED SOLUTION

Expert Contributor
Make sure the hostnames (/etc/hosts) are set up properly on all the hosts. Try pinging each host from the other hosts and check that you get a reply.
"Resource busy" means the process is still running. Kill the process and start over, making sure you have the correct hostnames or IP addresses.
Em Jay


4 REPLIES


Explorer

Hi,

Could you please elaborate a bit on how to get this fixed?

I am using CentOS. I am able to ping all the hosts, and 'hostname -f' returns the correct hostname.

Thanks,

nidm

Master Collaborator

See specifically the networking and security discussion here (along with the requirements for everything else in the parent section of the documentation for this link):

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installat...

Make sure the hostname is a fully qualified domain name and not just the short hostname.

Make sure the hostname value is not present on the loopback (127.0.0.1) line of the /etc/hosts file.
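
The loopback check above can be automated. The following is only a minimal sketch in Python 3: the function name hostnames_on_loopback and the sample hosts content are made up for illustration, and on a real node you would pass in the contents of /etc/hosts and that node's own names.

```python
def hostnames_on_loopback(hosts_text, names):
    """Return the given names that appear on a loopback line (127.x.x.x or ::1)."""
    wanted = {n.lower() for n in names}
    found = set()
    for raw in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, aliases = fields[0], fields[1:]
        if address.startswith("127.") or address == "::1":
            found.update(a.lower() for a in aliases if a.lower() in wanted)
    return sorted(found)


# Illustrative hosts file (hypothetical names) with the FQDN wrongly on the loopback line.
SAMPLE = """\
127.0.0.1   localhost.localdomain localhost node1.example.com
10.0.0.11   node1.example.com node1
"""

if __name__ == "__main__":
    print(hostnames_on_loopback(SAMPLE, ["node1.example.com", "node1"]))
```

An empty list means none of the names you passed sit on a loopback line. With the sample above, the result contains node1.example.com, which is the situation to fix.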

 

Check whether you get the FQDN from hostname (without the -f). Look at /etc/sysconfig/network (for example with vi) to double-check what the system has configured.

python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"

and then, to verify:

getent hosts <IP address returned>

The python command resolves both forward and reverse. Running getent hosts on the IP address it returned confirms that the address maps back to the same hostname.
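
For reference, the same forward and reverse check can be written as a small Python 3 script. This is a sketch rather than anything Cloudera ships: check_resolution is a made-up helper name, and it only uses the standard socket module. It does not replace the getent check, it just puts both lookups in one place.

```python
import socket


def check_resolution(name=None):
    """Resolve a name forward to an IP, then resolve that IP back to a name."""
    # With no name given, socket.getfqdn() describes the local host.
    fqdn = socket.getfqdn(name or "")
    result = {"fqdn": fqdn, "ip": None, "reverse": None, "consistent": False}
    try:
        result["ip"] = socket.gethostbyname(fqdn)                   # forward lookup
        result["reverse"] = socket.gethostbyaddr(result["ip"])[0]   # reverse lookup
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result["error"] = str(exc)
        return result
    result["consistent"] = result["reverse"].lower() == fqdn.lower()
    return result


if __name__ == "__main__":
    for key, value in check_resolution().items():
        print(key, value)
```

Run it on every node. If "consistent" is False, or an "error" entry shows up, the forward and reverse mappings for that host disagree and /etc/hosts or DNS is worth another look.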

 

Make sure iptables and SELinux are disabled (SELinux is configured in /etc/sysconfig/selinux).

Verify what the agents have configured for the CM host in /etc/cloudera-scm-agent/config.ini.
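
To compare that setting across nodes, something like the following Python 3 sketch can help. It assumes the agent file is INI-style and that the CM host and port are stored under keys named server_host and server_port; check those names against your own config.ini. The helper name read_server_settings and the sample values are invented for illustration.

```python
import configparser


def read_server_settings(config_text):
    """Collect server_host / server_port (assumed key names) from INI text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    found = {}
    # Scan every section so the sketch does not depend on a section name.
    for section in parser.sections():
        for key in ("server_host", "server_port"):
            if parser.has_option(section, key):
                found[key] = parser.get(section, key)
    return found


# Illustrative content only; on a node, read /etc/cloudera-scm-agent/config.ini instead.
SAMPLE = """\
[General]
# Hostname of the Cloudera Manager server (hypothetical value)
server_host=cm.example.com
server_port=7182
"""

if __name__ == "__main__":
    print(read_server_settings(SAMPLE))
```

Every agent should report the same server host, and that name should resolve from the agent node to the machine actually running CM.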

 

There are also many forum posts about CM that discuss agent issues; they are worth searching through.

Todd

 

 

Explorer

Todd,

Thanks for the detailed information. However, I still can't figure out a fix, other than rebooting the node.

 

My cluster has three nodes. The /etc/hosts files are identical on all of them:

 

127.0.0.1   localhost.localdomain localhost
10.122.195.196   hdfs001.demai.com hdfs001
10.122.195.197   hdfs002.demai.com hdfs002
10.122.195.198   hdfs003.demai.com hdfs003

I tested the hostname in the following ways:

[ptadm@hdfs001 work-demai]$ hostname
hdfs001.demai.com
[ptadm@hdfs001 work-demai]$ hostname -f
hdfs001.demai.com
[ptadm@hdfs001 work-demai]$ python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"
hdfs001.demai.com
10.122.195.196

I turned off SELinux and tested:

$ selinuxenabled && echo enabled || echo disabled
disabled