Expert Contributor
Posts: 63
Registered: ‎11-17-2016

cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 99] C

Hi,

 

I am using Azure to host my 3-node Cloudera Hadoop cluster. The operating system is Cloudera CentOS 6.7.

 

I am following installation Path A. My problem is this: last Friday I was able to install Cloudera Manager and the agents and bring up the CM GUI, but now cloudera-scm-agent goes down the moment I start it, on all 3 nodes.

 

The error in the agent log is:

 

{panel}

[22/Nov/2016 15:05:54 +0000] 2727 MainThread _cplogging INFO [22/Nov/2016:15:05:54] ENGINE Started monitor thread '_TimeoutMonitor'.
[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging ERROR [22/Nov/2016:15:05:54] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
    self.httpserver.start()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1825, in start
    raise socket.error(msg)
error: No socket could be created on ('lnxdatanode1.centralus.cloudapp.azure.com', 9000) -- [Errno 99] Cannot assign requested address

[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging INFO [22/Nov/2016:15:05:54] ENGINE Bus STOPPING
[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging INFO [22/Nov/2016:15:05:54] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('lnxdatanode1.centralus.cloudapp.azure.com', 9000)) already shut down
[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging INFO [22/Nov/2016:15:05:54] ENGINE Stopped thread '_TimeoutMonitor'.
[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging INFO [22/Nov/2016:15:05:54] ENGINE Bus STOPPED
[22/Nov/2016 15:05:54 +0000] 2727 HTTPServer Thread-2 _cplogging INFO [22/Nov/2016:15:05:54] ENGINE Bus EXITING

{panel}

 

No process is bound to port 7182, 9000 or 9001:

 

[root@LnxDataNode1 cloudera-scm-agent]# lsof -i:9000
[root@LnxDataNode1 cloudera-scm-agent]# lsof -i:9001
[root@LnxDataNode1 cloudera-scm-agent]# lsof -i:7182

 

iptables is turned off:

[root@LnxDataNode1 cloudera-scm-agent]# chkconfig iptables --list
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

If I run the following Python commands, I get this output:

 

[root@LnxDataNode1 cloudera-scm-agent]# python -m SimpleHTTPServer 9000

Serving HTTP on 0.0.0.0 port 9000 ...

 

[root@LnxDataNode1 cloudera-scm-agent]# python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'

lnxdatanode1.centralus.cloudapp.azure.com 40.77.28.5
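One way to read the two results above: `SimpleHTTPServer` binds to 0.0.0.0 (any local interface), while the traceback shows the agent's HTTP server binding to the address the FQDN resolves to. `[Errno 99]` (EADDRNOTAVAIL) is what `bind` reports when that address is not configured on any local interface, which is typically the case for a cloud public IP that reaches the VM through NAT. The following is a minimal Python 3 sketch, not part of Cloudera Manager, that reproduces the check; `try_bind` is a hypothetical helper name.

```python
import errno
import socket


def try_bind(host, port=0):
    """Resolve host and try to bind a TCP socket to the resulting address.

    Returns (ip, ok, reason). Port 0 lets the kernel pick any free port, so
    the check exercises only the address, not a specific port.
    """
    try:
        ip = socket.gethostbyname(host)
    except OSError as exc:
        return None, False, "cannot resolve: %s" % exc

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
    except OSError as exc:
        if exc.errno == errno.EADDRNOTAVAIL:
            # Same condition as "[Errno 99] Cannot assign requested address".
            return ip, False, "address is not assigned to a local interface"
        return ip, False, str(exc)
    finally:
        sock.close()
    return ip, True, "ok"
```

Calling `try_bind("0.0.0.0")` and `try_bind(socket.getfqdn())` on an affected node should show the difference: the first is expected to succeed, the second to fail if the FQDN maps to an address the machine does not own.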

 

But I see one problem: reverse lookup is not working. Could this be the cause?

 

{panel}

[root@LnxDataNode1 cloudera-scm-agent]# host lnxdatanode1.centralus.cloudapp.azure.com
lnxdatanode1.centralus.cloudapp.azure.com has address 40.77.28.5

[root@LnxDataNode1 cloudera-scm-agent]# hostname --fqdn
lnxdatanode1.centralus.cloudapp.azure.com


[root@LnxDataNode1 cloudera-scm-agent]# ping hostname
ping: unknown host hostname

 

[root@LnxDataNode1 cloudera-scm-agent]# hostname
LnxDataNode1

{panel}
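A forward and reverse lookup can also be compared programmatically. The sketch below is an illustration in Python 3 and assumes nothing beyond the standard `socket` module; `check_lookup` is a hypothetical helper, and the resolver functions are parameters only so that the logic can be exercised without touching real DNS.

```python
import socket


def check_lookup(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve name to an IP, resolve that IP back, and compare the results."""
    result = {"name": name, "ip": None, "reverse_name": None, "consistent": False}
    try:
        result["ip"] = forward(name)
    except OSError:
        # Forward lookup failed; nothing more to compare.
        return result
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        result["reverse_name"] = reverse(result["ip"])[0]
    except OSError:
        # Reverse lookup failed (for example, no PTR record or hosts entry).
        return result
    result["consistent"] = result["reverse_name"].lower() == name.lower()
    return result
```

Running `check_lookup(socket.getfqdn())` on each node gives a quick view of whether the name and the address map to each other in both directions.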

 

 

My /etc/hosts looks like this:

 

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 localdomain localhost
40.77.28.5 lnxdatanode1.centralus.cloudapp.azure.com lnxdatanode1
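Entries like the last one can be spotted mechanically. The sketch below, in Python 3 with the standard `ipaddress` module, lists hosts-file entries whose address is neither loopback, private, nor link-local. `public_host_entries` is a hypothetical helper, and treating every non-private address as suspect is an assumption that fits a NAT-fronted cloud VM, not a general rule.

```python
import ipaddress


def public_host_entries(hosts_text):
    """Return (ip, names) for hosts entries that use a public address."""
    flagged = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            # Not an address in the first column; skip the line.
            continue
        if ip.is_loopback or ip.is_private or ip.is_link_local:
            continue
        flagged.append((str(ip), fields[1:]))
    return flagged
```

Passing it the contents of /etc/hosts, for example `public_host_entries(open("/etc/hosts").read())`, returns the lines worth a second look.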

 

Please help, I am running out of time. I have to install the Hadoop ecosystem ASAP. :(

 

Thanks,

Shilpa

Expert Contributor
Posts: 63
Registered: ‎11-17-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

Guys, I seriously need help. Please guide me to resolve this issue.

[Attached screenshot: ports.PNG]

 

PS: As you can see above, I have already opened these ports from Azure for all 3 servers.

Expert Contributor
Posts: 63
Registered: ‎11-17-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

I checked using netcat and saw that none of the ports were open on any of the 3 nodes, even though I had opened them from Azure.

[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 9001
nc: connect to 40.122.212.98 port 9001 (tcp) failed: Connection refused
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 9000
nc: connect to 40.122.212.98 port 9000 (tcp) failed: Connection refused
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 7182
nc: connect to 40.122.212.98 port 7182 (tcp) failed: Connection refused
[root@LnxDataNode2 ~]#

 

So I added them to iptables and was able to see them open:

 

[root@LnxDataNode2 ~]# iptables -A INPUT -p tcp -m multiport --dports 50070,50090,50010,7180,7182,9000,9001 -j ACCEPT
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 9000
nc: connect to 40.122.212.98 port 9000 (tcp) failed: Connection refused
[root@LnxDataNode2 ~]# /etc/init.d/iptables restart
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 9000
^C
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 9001
^C
[root@LnxDataNode2 ~]# nc -vn 40.122.212.98 7182
^C
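The two kinds of `nc` results above mean different things: "Connection refused" is an immediate rejection, which usually indicates the host was reached but nothing is listening on that port, while a connection that hangs until interrupted usually indicates packets are being dropped somewhere on the path. A minimal Python 3 sketch that separates the cases follows; `probe` is a hypothetical helper, and the 3-second timeout is an arbitrary choice.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as open, refused, or timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.timeout:
        # No answer at all: packets are probably being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # Actively rejected: host reachable, but no listener on this port.
        return "refused"
    except OSError as exc:
        return "error: %s" % exc
    finally:
        sock.close()
```

Since the agent exits right after startup, a "refused" result on 9000 is what one would expect until the agent itself stays up, regardless of firewall settings.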

 

Even after doing this, the CM agent crashed the moment I started it. :(

 

Please tell me whether I did the right thing. Help me!

 

Thanks in Advance!

Shilpa

Contributor
Posts: 43
Registered: ‎05-12-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

It is definitely connected with the IP/hostname configuration on your host. It should look like this:

 

hostname :

yourHostname

 

hostname -f :

yourHostname

 

/etc/hosts :

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
yourPrivateIP   yourHostname

 

/etc/sysconfig/network :

NETWORKING=yes
HOSTNAME=yourHostname

 

Is 40.122.212.98 your private IP?

 

Expert Contributor
Posts: 63
Registered: ‎11-17-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

No, it's my public IP.

 

I changed my /etc/hosts to use the private IP; earlier it had the public IP. Now the CM agents are running.

 

Thanks so much :)

Expert Contributor
Posts: 63
Registered: ‎11-17-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

Plus, I fixed my reverse lookup, which was not working earlier.

Contributor
Posts: 31
Registered: ‎03-02-2017

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9

Hi

 

I am having the same issue. I have the public IP in /etc/hosts.

Can you suggest how to resolve it?

Expert Contributor
Posts: 63
Registered: ‎11-17-2016

Re: cloudera-scm-agent is going down again and again. No socket could be created on 9000 -- [Errno 9


Change it to the private IP, not the public IP. Also, check that forward and reverse lookup both work. And if you type hostname on the command line, do you get the FQDN?
