I am trying the CDH5 automatic installation. I have created two Ubuntu 12.04 LTS (Precise) desktop virtual machines using Oracle VirtualBox. I built the Ubuntu virtual machines as follows:
1. The virtual machines use bridged networking. ifconfig gives me the following IPs:
1.a 192.168.1.100 &
2. The following commands were executed on both Ubuntu machines:
2.a $ sudo passwd root - to enable root access
2.b $ sudo ufw allow 7180 - to open port 7180
2.c $ sudo ufw allow 7182 - to open port 7182
2.d $ sudo ufw allow 9000 - to open port 9000
2.e $ sudo ufw allow 9001 - to open port 9001
2.f $ sudo apt-get install openssh-server - to install openssh
2.g $ sudo ufw disable - to disable the firewall completely, as I couldn't make progress by just opening ports 7180, 7182, 9000 and 9001
3. Edited the /etc/hosts file on both machines to have the following lines
192.168.1.100 cloudera1-VirtualBox cloudera1
192.168.1.103 cloudera2-virtualBox cloudera2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
4. $ sudo chmod u+x cloudera-manager-installer.bin
5. $ sudo ./cloudera-manager-installer.bin and followed the on-screen instructions, as per http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installat...
6. Cloudera Manager installed, the Firefox browser started, and after a while I could access the console at http://192.168.1.100:7180/cmf/login with admin/admin
7. Specified the hosts for the Cloudera cluster installation.
8. Cluster installation progressed on both hosts, i.e. 192.168.1.100 & 192.168.1.103
9. Installation failed at the heartbeat step for 192.168.1.100, but I couldn't find any errors in the details. So I clicked Retry, and the installation completed the second time round for this node. The installation was still progressing on the second node, 192.168.1.103. Finally that installation also failed at the heartbeat step. I clicked Retry, but had no luck the second or third time either. The following is an excerpt from the details.
>>[29/May/2014 21:49:27 +0000] 5396 MainThread agent ERROR Failed to connect to previous supervisor.
>>Traceback (most recent call last):
>> File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1236, in find_or_start_supervisor
>> File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1423, in get_supervisor_process_info
>> self.identifier = self.supervisor_client.supervisor.getIdentification()
>> File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
>> return self.__send(self.__name, args)
>> File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
>> self.connection.request('POST', handler, request_body, self.headers)
>> File "/usr/lib/python2.7/httplib.py", line 958, in request
>> self._send_request(method, url, body, headers)
>> File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
>> File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
>> File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
>> File "/usr/lib/python2.7/httplib.py", line 776, in send
>> File "/usr/lib/python2.7/httplib.py", line 757, in connect
>> self.timeout, self.source_address)
>> File "/usr/lib/python2.7/socket.py", line 571, in create_connection
>> raise err
>>error: [Errno 111] Connection refused
>>[29/May/2014 21:49:27 +0000] 5396 MainThread tmpfs INFO Reusing mounted tmpfs at /run/cloudera-scm-agent/process
10. I checked $ sudo ufw status and the result is inactive on both virtual machines
11. When I run $ sudo service cloudera-scm-server status on 192.168.1.103, it reports an unrecognized service. When I run the same command on 192.168.1.100, it says running.
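In case it helps with diagnosing this, here is a minimal sketch of how the ports from step 2 could be probed from each virtual machine. It assumes Python 3 is available on both machines; the function names port_open and check_ports are just illustrative, and I am only guessing that reachability of 7180 and 7182 on 192.168.1.100 is relevant to the heartbeat failure.

```python
import socket

# Host running cloudera-scm-server (step 11) and two of the ports opened in step 2.
SERVER = "192.168.1.100"
PORTS = [7180, 7182]


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (Errno 111), timeouts and resolution failures.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    for port, ok in sorted(check_ports(SERVER, PORTS).items()):
        state = "open" if ok else "closed or unreachable"
        print("%s:%d %s" % (SERVER, port, state))
```

Running this on both 192.168.1.100 and 192.168.1.103 should show whether the second node can reach the server at all, independent of the installer.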
Could you please let me know where I went wrong in installing cloudera in a clustered environment?
Thanks & Regards,