Datanode is not connecting to namenode

New Contributor

Hello everyone,

 

I did a CDH 5.6 install using the Cloudera Manager installer bin on a four-node VM cluster. I was able to bring up the HDFS service with the NameNode role instance and three DataNode role instances, but the DataNodes are having connectivity issues with the NameNode. As a result, the NameNode sees the DataNodes as dead and shows 0 bytes available in the cluster.
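One way to confirm what the NameNode actually sees is the dfsadmin report, run on the NameNode host. (Running it as the hdfs superuser is an assumption here; on a single-user-mode install the process user may differ.)

sudo -u hdfs hdfs dfsadmin -report
# prints each DataNode with its state (live/dead) and capacity; with the problem
# described above, all three DataNodes show as dead and "DFS Remaining" is 0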

 

When I looked at the DataNode logs, I found the following error:

 

Problem connecting to server: devhdp01.xyz.int/172.16.67.184:8022

 

Block pool ID needed, but service not yet registered with NN
java.lang.Exception: trace
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:171)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getNamenodeAddresses(DataNode.java:2698)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
	at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
	at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
	at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
	at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
	at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
	at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
	at org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:346)
	at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:324)
	at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:217)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
	at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1286)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

 

This error is repeated, I guess every time the DataNode tries to heartbeat.

 

I checked netstat on the NameNode host and it is listening. I am not sure why the DataNode log shows a problem connecting to this IP and port.

 

tcp 0 0 172.16.67.184:8022 0.0.0.0:* LISTEN 989 6824028 -
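Since the NameNode side is listening, one thing worth checking is reachability and name resolution from a DataNode host. A rough check (hostname, IP, and port taken from the log lines above; adjust for your own cluster) might look like:

# on a DataNode host: does the NameNode hostname resolve, and to which address?
getent hosts devhdp01.xyz.int

# can a TCP connection be opened to the NameNode service RPC port?
nc -vz devhdp01.xyz.int 8022

# is a host firewall dropping the traffic on either end?
sudo iptables -L -n | grep 8022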

 

Some of the blog posts I read suggested checking whether /etc/hosts maps the hostname to the loopback address. Here is the content of my hosts file on the master. I commented out both lines and restarted HDFS through Cloudera Manager, but that didn't fix the issue.

 

#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
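For reference, the usual recommendation is not just to comment out the loopback entries, but to make sure every host's /etc/hosts (or DNS) maps each node's FQDN to its static IP and that `hostname -f` returns that FQDN. A sketch for a cluster like this (the NameNode line uses the IP from the log above; the DataNode line is a hypothetical example) might be:

127.0.0.1       localhost localhost.localdomain
172.16.67.184   devhdp01.xyz.int devhdp01    # NameNode host (IP from the log above)
172.16.67.187   devhdp04.xyz.int devhdp04    # example DataNode entry (hypothetical hostname)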

 

Any help is very much appreciated. I have been stuck on this issue for a long time now. It looks like I am missing something simple but I couldn't figure out what. Thank you!

 

 

UPDATE:

 

I see the following exception in the NameNode log:

 

PriviledgedActionException as:cloudera-scm (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Access denied for user cloudera-scm. Superuser privilege is required

 

IPC Server handler 8 on 8022, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.versionRequest from 172.16.67.187:34353 Call#596 Retry#0
org.apache.hadoop.security.AccessControlException: Access denied for user cloudera-scm. Superuser privilege is required
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:79)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:6578)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1264)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:248)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28861)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

Looks like this is the root cause, but I'm not sure why it doesn't have the privilege. In fact, cloudera-scm has passwordless sudo on all the hosts (that was a leftover permission from the last installation in single-user mode).
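For what it's worth, passwordless sudo is an OS-level permission and has no effect on this check: the NameNode only treats the user it was started as (normally hdfs) or members of the dfs.permissions.superusergroup group (default "supergroup") as superusers, so a DataNode registering as cloudera-scm gets rejected. A quick way to see whether the roles are running as mismatched users on a CM-managed host (paths shown are typical defaults and may differ) could be:

# which OS user owns the DataNode / NameNode processes?
ps -C java -o user=,cmd= | grep -iE 'datanode|namenode'

# which group does HDFS treat as superusers in the generated config?
# (the property may be absent if left at the default "supergroup")
grep -A1 dfs.permissions.superusergroup /var/run/cloudera-scm-agent/process/*NAMENODE*/hdfs-site.xml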

 

1 ACCEPTED SOLUTION

New Contributor

I did a complete uninstall, cleaned up, and removed all the folders related to Hadoop and Cloudera. It's working fine now. Earlier, I had done a single-user-mode install and uninstalled it to do the default installation. It looks like there were lingering files and folders that were causing the issue.
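For anyone hitting the same leftover-files situation: the cleanup between a single-user-mode install and a fresh default install usually has to cover both the Cloudera Manager agent/server state and the HDFS data directories, since a stale cluster ID in an old data directory can also keep DataNodes from registering. A rough, destructive sketch (paths are the common CM defaults; verify them against your own configuration before deleting anything):

# stop all cluster services and the cloudera-scm agents first, then on every host:
sudo rm -rf /var/lib/cloudera-scm-agent /var/lib/cloudera-scm-server \
            /var/log/cloudera-scm-* /var/run/cloudera-scm-*
# default HDFS metadata and data directories created by Cloudera Manager:
sudo rm -rf /dfs/nn /dfs/dn /var/lib/hadoop-hdfs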


4 REPLIES

New Contributor

I did a complete uninstall, cleaned up, and removed all the folders related to Hadoop and Cloudera. It's working fine now. Earlier, I had done a single-user-mode install and uninstalled it to do the default installation. It looks like there were lingering files and folders that were causing the issue.

Expert Contributor

Does anyone have an actual solution to this problem, other than re-installing the cluster? I am facing the same error for one of my DataNodes: it becomes unavailable after a restart and throws similar exceptions/errors.

 

PS: I cannot re-install because it's my production cluster. Looking for help.

 

Thanks,

Shilpa

New Contributor

I have the same problem now. Can you help, please?

Explorer

Did you get a solution to this issue? I would be grateful for your response.