
namenode and datanodes not starting after Ambari HDP installation

Explorer

Hi @Jitendra Yadav, @Sagar Shimpi and @Kuldeep Kulkarni,

We have been facing this problem since last week and are unable to start the Hadoop NameNode and DataNode services.

We are trying to install HDP on a 3-node cluster. We were able to download and install all the packages via the Ambari GUI, but the services failed to start in the last step of the installation.

Summary of the Ambari installation:

[screenshot: 5340-untitled.png]

I also checked the NameNode ports with these commands, and everything looks fine:

netstat -tulapn|grep 50070
netstat -tulapn|grep 8020
lsof -i :50070
ps -aef |grep -i namenode

Note: iptables and SELinux are also disabled on all the hosts.
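This can be verified on each host with the standard RHEL commands:

# service iptables status
# getenforce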

Below is the output of the following commands for more info:

$ hostname
$ hostname -f
$ cat /proc/sys/net/ipv4/ip_local_port_range
$ ifconfig
$ netstat -nlp --inet
$ ls -lR /var/run/hadoop/
$ cat /etc/hosts

[root@RHEL01 ~]# hostname
RHEL01
[root@RHEL01 ~]# hostname -f
RHEL01.ind.hp.com
[root@RHEL01 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000
[root@RHEL01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:D3:E3:4D
          inet addr:192.168.52.104  Bcast:192.168.52.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fed3:e34d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1400  Metric:1
          RX packets:1890883 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1894291 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1442566771 (1.3 GiB)  TX bytes:2264258248 (2.1 GiB)


lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:2468413 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2468413 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3084909353 (2.8 GiB)  TX bytes:3084909353 (2.8 GiB)


[root@RHEL01 ~]# netstat -nlp --inet
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1038/rpcbind
tcp        0      0 0.0.0.0:4242                0.0.0.0:*                   LISTEN      28763/jsvc.exec
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1303/sshd
tcp        0      0 127.0.0.1:631               0.0.0.0:*                   LISTEN      1088/cupsd
tcp        0      0 0.0.0.0:5432                0.0.0.0:*                   LISTEN      1530/postmaster
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1321/sendmail
tcp        0      0 0.0.0.0:50010               0.0.0.0:*                   LISTEN      28228/java
tcp        0      0 0.0.0.0:50075               0.0.0.0:*                   LISTEN      28228/java
tcp        0      0 0.0.0.0:35227               0.0.0.0:*                   LISTEN      1056/rpc.statd
tcp        0      0 127.0.0.1:44605             0.0.0.0:*                   LISTEN      28228/java
tcp        0      0 0.0.0.0:8670                0.0.0.0:*                   LISTEN      6018/python
tcp        0      0 0.0.0.0:50079               0.0.0.0:*                   LISTEN      28763/jsvc.exec
tcp        0      0 0.0.0.0:2049                0.0.0.0:*                   LISTEN      28763/jsvc.exec
tcp        0      0 0.0.0.0:8010                0.0.0.0:*                   LISTEN      28228/java
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1038/rpcbind
udp        0      0 0.0.0.0:631                 0.0.0.0:*                               1088/cupsd
udp        0      0 0.0.0.0:47113               0.0.0.0:*                               1056/rpc.statd
udp        0      0 0.0.0.0:4242                0.0.0.0:*                               28763/jsvc.exec
udp        0      0 0.0.0.0:789                 0.0.0.0:*                               1038/rpcbind
udp        0      0 127.0.0.1:40                0.0.0.0:*                               28763/jsvc.exec
udp        0      0 0.0.0.0:808                 0.0.0.0:*                               1056/rpc.statd
udp        0      0 0.0.0.0:68                  0.0.0.0:*                               959/dhclient
[root@RHEL01 ~]# ls -lR /var/run/hadoop/
/var/run/hadoop/:
total 16
drwxrwxr-x. 2 hdfs   hadoop 4096 Jun 29 04:40 hdfs
drwxrwxr-x. 2 mapred hadoop 4096 Jun 27 08:18 mapreduce
drwxr-xr-x. 2 root   root   4096 Jun 29 04:40 root
drwxrwxr-x. 2 yarn   hadoop 4096 Jun 27 08:18 yarn


/var/run/hadoop/hdfs:
total 8
-rw-r--r-- 1 hdfs hadoop 6 Jun 29 04:39 hadoop-hdfs-datanode.pid
-rw-r--r-- 1 hdfs hadoop 6 Jun 29 04:40 hadoop-hdfs-namenode.pid


/var/run/hadoop/mapreduce:
total 0


/var/run/hadoop/root:
total 12
-rw-r--r-- 1 root root 6 Jun 29 04:40 hadoop-hdfs-nfs3.pid
-rw------- 1 root root 6 Jun 29 04:40 hadoop_privileged_nfs3.pid
-rw-r--r-- 1 root root 6 Jun 28 05:04 hadoop-root-namenode.pid


/var/run/hadoop/yarn:
total 0
[root@RHEL01 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6


16.181.235.174   RHEL01.ind.hp.com RHEL01
16.181.235.175   RHEL02.ind.hp.com RHEL02
16.181.235.176   RHEL03.ind.hp.com RHEL03

Attaching log files:

Output of the /var/log/hadoop/hdfs/hadoop-hdfs-namenode-RHEL01.out file:

ulimit -a for user hdfs
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127481
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 128000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


Output of the /var/log/hadoop/hdfs/hadoop-hdfs-namenode-RHEL01.log file:

STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r 26104d8ac833884c8776473823007f176854f2eb; compiled by 'jenkins' on 2016-02-10T06:18Z
STARTUP_MSG:   java = 1.8.0_60
************************************************************/
2016-06-29 04:40:42,706 INFO  namenode.NameNode (LogAdapter.java:info(47)) - registered UNIX signal handlers for [TERM, HUP, INT]
2016-06-29 04:40:42,710 INFO  namenode.NameNode (NameNode.java:createNameNode(1559)) - createNameNode []
2016-06-29 04:40:43,083 INFO  impl.MetricsConfig (MetricsConfig.java:loadFirst(112)) - loaded properties from hadoop-metrics2.properties
2016-06-29 04:40:43,264 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(64)) - Initializing Timeline metrics sink.
2016-06-29 04:40:43,270 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(82)) - Identified hostname = rhel01.ind.hp.com, serviceName = namenode
2016-06-29 04:40:43,356 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(99)) - Collector Uri: http://rhel03.ind.hp.com:6188/ws/v1/timeline/metrics
2016-06-29 04:40:43,360 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(147)) - RPC port properties configured: {8020=client}
2016-06-29 04:40:43,367 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(206)) - Sink timeline started
2016-06-29 04:40:43,458 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(377)) - Scheduled snapshot period at 10 second(s).
2016-06-29 04:40:43,458 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:start(192)) - NameNode metrics system started
2016-06-29 04:40:43,461 INFO  namenode.NameNode (NameNode.java:setClientNamenodeAddress(424)) - fs.defaultFS is hdfs://rhel01.ind.hp.com:8020
2016-06-29 04:40:43,461 INFO  namenode.NameNode (NameNode.java:setClientNamenodeAddress(444)) - Clients are to use rhel01.ind.hp.com:8020 to access this namenode/service.
2016-06-29 04:40:43,624 INFO  hdfs.DFSUtil (DFSUtil.java:httpServerTemplateForNNAndJN(1726)) - Starting Web-server for hdfs at: http://RHEL01.ind.hp.com:50070
2016-06-29 04:40:43,695 INFO  mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-06-29 04:40:43,704 INFO  server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(294)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-06-29 04:40:43,712 INFO  http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.namenode is not defined
2016-06-29 04:40:43,768 INFO  http.HttpServer2 (HttpServer2.java:addGlobalFilter(710)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-06-29 04:40:43,771 INFO  http.HttpServer2 (HttpServer2.java:addFilter(685)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2016-06-29 04:40:43,771 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-06-29 04:40:43,772 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-06-29 04:40:43,812 INFO  http.HttpServer2 (NameNodeHttpServer.java:initWebHdfs(86)) - Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2016-06-29 04:40:43,814 INFO  http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(609)) - addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2016-06-29 04:40:43,833 INFO  http.HttpServer2 (HttpServer2.java:start(859)) - HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: RHEL01.ind.hp.com:50070
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919)
        at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:716)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:914)
        ... 8 more
2016-06-29 04:40:43,835 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2016-06-29 04:40:43,836 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2016-06-29 04:40:43,836 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2016-06-29 04:40:43,836 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdown complete.
2016-06-29 04:40:43,837 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
java.net.BindException: Port in use: RHEL01.ind.hp.com:50070
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919)
        at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:716)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:914)
        ... 8 more
2016-06-29 04:40:43,839 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-06-29 04:40:43,840 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at RHEL01.ind.hp.com/16.181.235.174
************************************************************/


Output of the /var/log/hadoop/hdfs/hadoop-hdfs-datanode-RHEL03.out file:

ulimit -a for user hdfs
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127481
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 128000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Output of the /var/log/hadoop/hdfs/hadoop-hdfs-datanode-RHEL03.log file:

STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r 26104d8ac833884c8776473823007f176854f2eb; compiled by 'jenkins' on 2016-02-10T06:18Z
STARTUP_MSG:   java = 1.8.0_60
************************************************************/
2016-06-27 09:05:55,706 INFO  datanode.DataNode (LogAdapter.java:info(45)) - registered UNIX signal handlers for [TERM, HUP, INT]
2016-06-27 09:05:56,508 INFO  impl.MetricsConfig (MetricsConfig.java:loadFirst(112)) - loaded properties from hadoop-metrics2.properties
2016-06-27 09:05:56,672 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(64)) - Initializing Timeline metrics sink.
2016-06-27 09:05:56,672 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(82)) - Identified hostname = rhel03.ind.hp.com, serviceName = datanode
2016-06-27 09:05:56,673 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(99)) - Collector Uri: http://rhel03.ind.hp.com:6188/ws/v1/timeline/metrics
2016-06-27 09:05:56,681 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(206)) - Sink timeline started
2016-06-27 09:05:56,753 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(377)) - Scheduled snapshot period at 10 second(s).
2016-06-27 09:05:56,753 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:start(192)) - DataNode metrics system started
2016-06-27 09:05:56,758 INFO  datanode.BlockScanner (BlockScanner.java:<init>(172)) - Initialized block scanner with targetBytesPerSec 1048576
2016-06-27 09:05:56,760 INFO  datanode.DataNode (DataNode.java:<init>(418)) - File descriptor passing is enabled.
2016-06-27 09:05:56,760 INFO  datanode.DataNode (DataNode.java:<init>(429)) - Configured hostname is RHEL03.ind.hp.com
2016-06-27 09:05:56,765 INFO  datanode.DataNode (DataNode.java:startDataNode(1127)) - Starting DataNode with maxLockedMemory = 0
2016-06-27 09:05:56,788 INFO  datanode.DataNode (DataNode.java:initDataXceiver(921)) - Opened streaming server at /0.0.0.0:50010
2016-06-27 09:05:56,790 INFO  datanode.DataNode (DataXceiverServer.java:<init>(76)) - Balancing bandwith is 6250000 bytes/s
2016-06-27 09:05:56,790 INFO  datanode.DataNode (DataXceiverServer.java:<init>(77)) - Number threads for balancing is 5
2016-06-27 09:05:56,794 INFO  datanode.DataNode (DataXceiverServer.java:<init>(76)) - Balancing bandwith is 6250000 bytes/s
2016-06-27 09:05:56,795 INFO  datanode.DataNode (DataXceiverServer.java:<init>(77)) - Number threads for balancing is 5
2016-06-27 09:05:56,795 INFO  datanode.DataNode (DataNode.java:initDataXceiver(936)) - Listening on UNIX domain socket: /var/lib/hadoop-hdfs/dn_socket
2016-06-27 09:05:56,878 INFO  mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-06-27 09:05:56,889 INFO  server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(294)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-06-27 09:05:56,895 INFO  http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.datanode is not defined
2016-06-27 09:05:56,901 INFO  http.HttpServer2 (HttpServer2.java:addGlobalFilter(710)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-06-27 09:05:56,904 INFO  http.HttpServer2 (HttpServer2.java:addFilter(685)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2016-06-27 09:05:56,904 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-06-27 09:05:56,904 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-06-27 09:05:56,925 INFO  http.HttpServer2 (HttpServer2.java:openListeners(915)) - Jetty bound to port 47941
2016-06-27 09:05:56,925 INFO  mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26.hwx
2016-06-27 09:05:57,134 INFO  mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:47941
2016-06-27 09:05:57,309 INFO  web.DatanodeHttpServer (DatanodeHttpServer.java:start(201)) - Listening HTTP traffic on /0.0.0.0:50075
2016-06-27 09:05:57,492 INFO  datanode.DataNode (DataNode.java:startDataNode(1144)) - dnUserName = hdfs
2016-06-27 09:05:57,492 INFO  datanode.DataNode (DataNode.java:startDataNode(1145)) - supergroup = hdfs
2016-06-27 09:05:57,999 INFO  ipc.CallQueueManager (CallQueueManager.java:<init>(56)) - Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-06-27 09:05:58,014 INFO  ipc.Server (Server.java:run(676)) - Starting Socket Reader #1 for port 8010
2016-06-27 09:05:58,039 INFO  datanode.DataNode (DataNode.java:initIpcServer(837)) - Opened IPC server at /0.0.0.0:8010
2016-06-27 09:05:58,050 INFO  datanode.DataNode (BlockPoolManager.java:refreshNamenodes(152)) - Refresh request received for nameservices: null
2016-06-27 09:05:58,069 INFO  datanode.DataNode (BlockPoolManager.java:doRefreshNamenodes(197)) - Starting BPOfferServices for nameservices: <default>
2016-06-27 09:05:58,080 INFO  datanode.DataNode (BPServiceActor.java:run(814)) - Block pool <registering> (Datanode Uuid unassigned) service to rhel01.ind.hp.com/16.181.235.174:8020 starting to offer service
2016-06-27 09:05:58,086 INFO  ipc.Server (Server.java:run(906)) - IPC Server Responder: starting
2016-06-27 09:05:58,087 INFO  ipc.Server (Server.java:run(746)) - IPC Server listener on 8010: starting
2016-06-27 09:05:59,193 INFO  ipc.Client (Client.java:handleConnectionFailure(869)) - Retrying connect to server: rhel01.ind.hp.com/16.181.235.174:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2016-06-27 09:06:00,204 INFO  ipc.Client (Client.java:handleConnectionFailure(869)) - Retrying connect to server: rhel01.ind.hp.com/16.181.235.174:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2016-06-27 09:06:01,205 INFO  ipc.Client (Client.java:handleConnectionFailure(869)) - Retrying connect to server: rhel01.ind.hp.com/16.181.235.174:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)

Re: namenode and datanodes not starting after Ambari HDP installation

Guru

Hi, you already have something running on port 50070 (the NameNode UI).

Identify the process with:

netstat -anpe | grep 50070 | grep LISTEN

and kill it before retrying.
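If that shows a listener, something like the following (a minimal sketch, assuming fuser from the psmisc package is installed) will identify and then free the port:

fuser -n tcp 50070      # print the PID(s) holding TCP port 50070
fuser -k -n tcp 50070   # or kill the holder(s) directly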


Re: namenode and datanodes not starting after Ambari HDP installation

Guru

Besides that, hostname and hostname -f should return the same FQDN, so you might modify the hostname:

[root@RHEL01 ~]# hostname RHEL01.ind.hp.com
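Note that this sets the hostname only for the running session. To persist it across reboots (a sketch assuming RHEL 6, which the ifconfig output suggests), update /etc/sysconfig/network as well:

[root@RHEL01 ~]# sed -i 's/^HOSTNAME=.*/HOSTNAME=RHEL01.ind.hp.com/' /etc/sysconfig/network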

Re: namenode and datanodes not starting after Ambari HDP installation

Explorer

I tried that command, Laurent, but there was no output. I tried changing the hostname as well, but it didn't work.


Re: namenode and datanodes not starting after Ambari HDP installation

Guru

What was the result of the above netstat command?

When you have the PID, please share the output of:

ps aux | grep <PID>
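Or, combining the two steps (a sketch, assuming lsof is available):

PID=$(lsof -t -i :50070)          # -t prints just the PID(s), if any
[ -n "$PID" ] && ps aux | grep "$PID"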

Re: namenode and datanodes not starting after Ambari HDP installation

Expert Contributor

Hi Aman,

Please click on Complete and try to start all the services manually from the Ambari UI. Sometimes services do not start right after the installation because they time out. Please let me know if it works.
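If the UI keeps timing out, the same start can also be triggered through Ambari's REST API. A rough sketch, where the cluster name "mycluster", the host "ambari-host", and the admin:admin credentials are placeholders for your own values:

curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Start HDFS"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
  http://ambari-host:8080/api/v1/clusters/mycluster/services/HDFS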

Thanks,

Manish


Re: namenode and datanodes not starting after Ambari HDP installation

Explorer

Hi Manish,

I have tried restarting the services many times, but every time the NameNode throws the same error.


Re: namenode and datanodes not starting after Ambari HDP installation

@Aman Mundra

Please try restarting the NameNode and DataNodes, and then try to start the services again.


Re: namenode and datanodes not starting after Ambari HDP installation

Rising Star

@Aman Mundra

From the NameNode log, I see that port 50070 is already in use:

  • INFO http.HttpServer2 (HttpServer2.java:start(859)) - HttpServer.start() threw a non Bind IOException
  • java.net.BindException: Port in use: RHEL01.ind.hp.com:50070

Since port 50070 is not free, the NameNode cannot start.

Now we need to identify which process is holding port 50070. We can check this via "netstat".

Can you please share the output of the following:

# netstat -apln | grep 50070

# jps -lv

We could kill the process holding port 50070, but it would be better to first identify which process/application it is; we can then check that application's configuration to verify whether it is also set to use port 50070.

As an alternative, you can use a different port for the NameNode, for example 50080, and see if the NameNode process comes up after making this configuration change.
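For reference, the NameNode web UI address is controlled by dfs.namenode.http-address in hdfs-site.xml (editable under HDFS > Configs in Ambari). A quick sketch to confirm the current value and that 50080 is actually free before switching (this assumes the HDFS client is on the PATH):

# hdfs getconf -confKey dfs.namenode.http-address

# netstat -tlnp | grep 50080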


Re: namenode and datanodes not starting after Ambari HDP installation

Explorer

Hi Ravi,

I executed the commands you suggested, but there was no result; it seems the port is not being used by any of the services.

By the way, I'm installing Hadoop on a 3-node cluster hosted in a cloud environment. Does that make any difference? I read somewhere that this can also be caused by an IP conflict or a misconfigured /etc/hosts file. I've attached the contents of my hosts file above; please check whether everything is alright. Thanks.
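For what it's worth, here is a quick check (a sketch using standard tools) of whether the hostname resolves to an address that is actually configured on a local interface; the "Cannot assign requested address" error above would be consistent with a mismatch:

[root@RHEL01 ~]# getent hosts RHEL01.ind.hp.com   # address the NameNode will try to bind to
[root@RHEL01 ~]# ip addr show | grep 'inet '      # addresses actually on the local interfaces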