Support Questions

cjervis · ‎11-06-2013

Hello,

I'm doing a basic cluster installation using Cloudera Standard 4 on CentOS 6.3 (64-bit), and it does not work. Here is what I'm doing:

1. I run ./cloudera-manager-installer.bin and go to http://132.207.67.11:7180/ to continue the installation.

2. Then I follow the wizard and add the host 132.207.67.11 to the cluster installation.

3. The cluster installation on this host is successful but I find it strange that the IP is changed to 127.0.0.1.

4. But anyway, I continue and install the parcels.

5. And then at the hosts inspection I get this:

Cluster Installation

Inspect hosts for correctness Run Again

Validations

	Inspector failed on the following hosts... homer.larim.polymtl.ca: IOException thrown while collecting data from host: Connection refused Inspector ran on 0 hosts.
	The inspector failed to run on all hosts.
	0 hosts are running CDH3 and 1 hosts are running CDH4.
	All checked hosts are running the same version of components.
	All managed hosts have consistent versions of Java.
	All checked Cloudera Management Daemons versions are consistent with the server.
	All checked Cloudera Management Agents versions are consistent with the server.

And from the log file cloudera-scm-agent.out:

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent INFO SCM Agent Version: 4.7.3

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent INFO Using directory: /var/run/cloudera-scm-agent

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent INFO Using supervisor binary path: /usr/lib64/cmf/agent/src/cmf/../../build/env/bin/supervisord

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent WARNING Agent is running on 127.0.0.1 (localhost). This is a misconfiguration for multi-machine clusters. Check your hostname settings.

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent INFO Adding env vars that start with CMF_AGENT_

[06/Nov/2013 09:45:45 +0000] 21416 MainThread agent INFO Logging to /var/log/cloudera-scm-agent/cloudera-scm-agent.log

Why is it using 127.0.0.1? I always use the FQDN, and name resolution is handled by DNS. Just to be sure, here is the content of my hosts file:

127.0.0.1 localhost.localdomain localhost

::1 homer.larim.polymtl.ca homer localhost6.localdomain6 localhost6

132.207.67.11 homer.larim.polymtl.ca homer

Anyway, I'm a bit baffled by the problem, since I'm doing a vanilla installation with all the default values. Can anybody help me?

Thanks a lot...

foxz88 · ‎11-06-2013

OK, I found the solution. I changed the hosts file to the following (removed the FQDN from the localhost entries):

127.0.0.1 localhost

::1 localhost6

132.207.67.11 homer.larim.polymtl.ca homer

Edit: Removed an entry to avoid any confusion.
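
For anyone who wants to check their own hosts file for the same mistake, here is a minimal Python 3 sketch. The function name `find_loopback_aliases` and the `LOCAL_NAMES` set are made up for this illustration and are not part of Cloudera Manager; the sample text is the original hosts file from the question.

```python
import ipaddress

# Names that are expected to map to a loopback address (an assumption of this sketch).
LOCAL_NAMES = {
    "localhost",
    "localhost.localdomain",
    "localhost6",
    "localhost6.localdomain6",
}


def find_loopback_aliases(hosts_text):
    """Return (ip, name) pairs where a non-localhost name maps to a loopback address."""
    problems = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not a plain IP address; skip the line
        if addr.is_loopback:
            for name in fields[1:]:
                if name.lower() not in LOCAL_NAMES:
                    problems.append((fields[0], name))
    return problems


# The hosts file from the original question.
ORIGINAL = """\
127.0.0.1 localhost.localdomain localhost
::1 homer.larim.polymtl.ca homer localhost6.localdomain6 localhost6
132.207.67.11 homer.larim.polymtl.ca homer
"""

for ip, name in find_loopback_aliases(ORIGINAL):
    print(ip, name)
```

On the original file this flags the two names on the ::1 line; on the corrected file it reports nothing. To check a real machine, pass in the contents of /etc/hosts instead of `ORIGINAL`.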

asterix · ‎02-14-2016

Hi,

I was interested in this thread since I encountered similar issues in a simple cluster config.

I'm using CM5 on CentOS 7.2

The host inspector gave :

master1.domain; worker[1-3].domain: IOException thrown while collecting data from host: Connection refused

The point is that each host has a public IP address on eth0 and a private IP address on eth1.

As you can guess, I want my cluster to use the internal IP only.

I tried several things (after each change, I restarted cloudera-scm-agent to make sure the modification was taken into account):

1 - I tried modifying my /etc/hosts to specify the public FQDN for the public IP ==> FAIL

2 - I tried to use /etc/cloudera-scm/agent/config.ini , listening_ip to listen ONLY on the private IP ==> FAIL

3 - I tried to use /etc/cloudera-scm/agent/config.ini , listening_hostname to listen ONLY on the hostname associated with the private interface ==> FAIL.

At this stage, I can say the Cloudera agent is listening only on the private interface (lsof confirmed), but the inspector does not seem to take this into account.

4 - I shut down eth0 (public interface) to disable multiple hostnames ==> SUCCESS

At this stage, I wondered why step 3 failed and step 4 succeeded. I think this is because the Python script below is used to detect the hostname, instead of the Cloudera config file:

python -c 'import socket; \
print socket.getfqdn(), \
socket.gethostbyname(socket.getfqdn())'

This script seems to return the FQDN for eth0 first, so no luck for me.
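
For reference, here is a Python 3 sketch of the same lookup that also flags a loopback result. The function name `resolve_agent_identity` and its injectable resolver arguments are invented for this example. Whether the inspector relies on exactly this lookup is only my guess.

```python
import ipaddress
import socket


def resolve_agent_identity(getfqdn=socket.getfqdn, gethostbyname=socket.gethostbyname):
    """Repeat the one-liner's lookup and return (fqdn, ip, is_loopback).

    The resolver functions are parameters only so the logic can be tried
    with fake values; by default the real socket calls are used.
    """
    fqdn = getfqdn()
    ip = gethostbyname(fqdn)  # IPv4 address the FQDN resolves to
    return fqdn, ip, ipaddress.ip_address(ip).is_loopback


if __name__ == "__main__":
    fqdn, ip, is_loopback = resolve_agent_identity()
    print(fqdn, ip)
    if is_loopback:
        print("WARNING: the FQDN resolves to a loopback address")
```

Run it on each host. If the printed address is not the one you expect the cluster to use (the private one in my case), name resolution on that host is the first thing to look at.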

I'm not sure this is the proper solution, but the trick worked for me. It would make sense for Cloudera staff to review the inspector code and make sure the Python code honors the config file.

vibe · ‎01-09-2018

This /etc/hosts fixed my problem:

#internal.ip local.hostname
192.168.1.2 testserver
192.168.1.3 testnode

#public.ip public.hostname

#https://www.whatismyip.com/reverse-dns-lookup/
x.x.x.x testserver.wherever.com

Hope this helps someone else.

resolution · ‎02-14-2018

Try allowing the port through firewalld on the host; this solved my problem:

$ firewall-cmd --zone=public --add-port=9000/tcp

(Centos7)

or $ systemctl stop firewalld (add systemctl disable firewalld to keep it off after a reboot)

Hope this may help
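
To check from another machine whether the port is actually reachable, a minimal Python 3 sketch like this can help. The helper name `port_is_open` and the host name in the usage line are placeholders; port 9000 is the one opened above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "worker1.example.com" is a placeholder; use one of your own hosts.
    print(port_is_open("worker1.example.com", 9000))
```

Run it from the Cloudera Manager server against each host. A False result only says the connection failed; it does not say whether the firewall, the agent, or name resolution is to blame.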

Cluster installation - The inspector failed to run on all hosts.