
Data node process not starting up

Rising Star

Even though I have the right permissions, my DataNode is not starting up. It gives the following error: (Connection failed: [Errno 111] Connection refused to 0.0.0.0:50010)

[root@ip pig]# cd /hadoop/hdfs/data
[root@ip hdfs]# ls -lrt
total 0
drwxr-x---. 3 hdfs hadoop 20 Apr 5 10:54 data
drwxr-xr-x. 4 hdfs hadoop 63 Apr 9 14:08 namenode
drwxr-xr-x. 3 hdfs hadoop 38 Apr 9 14:16 namesecondary

Just want to add: the last time everything was running, I did the following.

I added the mapred user to the hdfs group and gave it rwx permission on the / directory... can this be the reason for the failure?
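
For reference, the change can be verified with standard Linux tools (a quick sketch; mapred is the user mentioned above):

# id mapred
# ls -ld /

The first command lists the groups the mapred user belongs to; the second shows the owner, group, and mode of the root directory.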

Just to add: not only the DataNode, but everything else is failing to run now. Is there any way to find out what may have gone wrong?

1 ACCEPTED SOLUTION

Rising Star

Thank you all... I have got it resolved. I did two things:

1. For the NameNode: I simply changed the port to some other value.

2. For the DataNode: I found an error message in the logs saying that "/" has permission 777 and the DataNode cannot start. I remembered I had changed it manually for some other problem; reverting it resolved the issue.
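
For anyone hitting the same error, the check and the revert look like this (a minimal sketch, assuming the conventional 755 mode for /):

# stat -c '%a %U:%G' /
# chmod 755 /

stat prints the octal mode and ownership (e.g. "777 root:root" if the mode was opened up), and chmod 755 / restores the standard root-directory mode.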


13 REPLIES

Rising Star

@Amit Sharma

1. Confirm from Ambari that the DataNode process is down.

2. Ensure there are no stale DataNode processes running.

# ps -ef | grep datanode | grep -v grep

3. Check that no other program/service is listening on port 50010.

# netstat -anp | grep '0.0.0.0:50010'

4. Check that there are no leftover PID files. Remove the PID file if it exists. (A combined sketch of steps 2-4 follows this list.)

# cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid

5. While starting the DataNode from Ambari, tail the log file to review startup messages and related errors/warnings.

# tailf /var/log/hadoop/hdfs/hadoop-hdfs-datanode-[hostname].log
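
If it helps, here is a minimal script combining steps 2-4 into a single pass (a sketch; it assumes the default HDP PID path and DataNode port 50010 used above):

#!/bin/bash
# Sketch: check for a stale DataNode process, a busy port, and a leftover PID file.
PID_FILE=/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid
PORT=50010

# Step 2: any stale DataNode JVM still running? ([d]atanode stops grep matching itself)
ps -ef | grep -i '[d]atanode' && echo "stale DataNode process found"

# Step 3: anything already bound to the DataNode transfer port?
netstat -anp | grep ":${PORT} " && echo "port ${PORT} is already in use"

# Step 4: PID file left behind by a dead process?
if [ -f "$PID_FILE" ]; then
    PID=$(cat "$PID_FILE")
    if ! kill -0 "$PID" 2>/dev/null; then
        echo "removing stale PID file (process $PID is gone)"
        rm -f "$PID_FILE"
    fi
fi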

Rising Star

None of the above commands shows anything suspicious... these are the logs from the last command. Anything suspicious?

2016-04-10 20:09:01,856 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:start(192)) - NameNode metrics system started
2016-04-10 20:09:01,858 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(424)) - fs.defaultFS is hdfs://ip-xxx-xxx-xxx-xxx.us-west-2.compute.internal:8020
2016-04-10 20:09:01,858 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(444)) - Clients are to use ip-xxx-xxx-xxx-xxx.us-west-2.compute.internal:8020 to access this namenode/service.
2016-04-10 20:09:02,025 INFO hdfs.DFSUtil (DFSUtil.java:httpServerTemplateForNNAndJN(1726)) - Starting Web-server for hdfs at: http://ip-xxx-xxx-xxx-xxx.us-west-2.compute.internal:50070
2016-04-10 20:09:02,072 INFO mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-04-10 20:09:02,081 INFO server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(294)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-04-10 20:09:02,086 INFO http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.namenode is not defined
2016-04-10 20:09:02,091 INFO http.HttpServer2 (HttpServer2.java:addGlobalFilter(710)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-04-10 20:09:02,093 INFO http.HttpServer2 (HttpServer2.java:addFilter(685)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2016-04-10 20:09:02,093 INFO http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-04-10 20:09:02,093 INFO http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-04-10 20:09:02,114 INFO http.HttpServer2 (NameNodeHttpServer.java:initWebHdfs(86)) - Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2016-04-10 20:09:02,116 INFO http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(609)) - addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2016-04-10 20:09:02,127 INFO http.HttpServer2 (HttpServer2.java:openListeners(915)) - Jetty bound to port 50070
2016-04-10 20:09:02,128 INFO mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26.hwx
2016-04-10 20:09:02,381 INFO mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup@ip-xxx-xxx-xxx-xxx.us-west-2.compute.internal:50070
2016-04-10 20:09:02,404 WARN common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2016-04-10 20:09:02,404 WARN common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2016-04-10 20:09:02,404 WARN namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(654)) - Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2016-04-10 20:09:02,405 WARN namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(659)) - Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
2016-04-10 20:09:02,410 WARN common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.


Master Mentor

@Amit Sharma

The commands I gave are for CentOS 6; from your output, you appear to be running either CentOS 7 or Red Hat 7.

Nice to know your issue was resolved. Always give the OS, Ambari, HDP, or component release to make it easier for the forum to help.
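
For example, on CentOS/RHEL that information can be gathered like this (a sketch; hdp-select applies to HDP clusters only):

# cat /etc/redhat-release
# rpm -q ambari-server
# hdp-select versions

The first prints the OS release, the second the installed Ambari server package version (run on the Ambari host), and the third lists the installed HDP stack versions.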