Created on 03-03-2016 11:05 AM - edited 09-16-2022 03:07 AM
Hello everyone,
I installed CDH 5.6 using the Cloudera Manager installer binary on a four-node VM cluster. I was able to bring up the HDFS service with one NameNode role instance and three DataNode role instances. However, the DataNodes have a connectivity issue with the NameNode, so the NameNode sees the DataNodes as dead and shows 0 bytes available in the cluster.
When I looked at the DataNode logs, I found the following error:
Problem connecting to server: devhdp01.xyz.int/172.16.67.184:8022
Block pool ID needed, but service not yet registered with NN
java.lang.Exception: trace
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:171)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getNamenodeAddresses(DataNode.java:2698)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
    at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
    at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
    at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:346)
    at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:324)
    at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:217)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1286)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
This error repeats, I guess every time the DataNode tries to send a heartbeat.
I checked netstat on the NameNode host, and it is listening. I am not sure why the DataNode log shows a problem connecting to this IP and port.
tcp 0 0 172.16.67.184:8022 0.0.0.0:* LISTEN 989 6824028 -
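Since netstat only shows that the port is open on the NameNode host itself, it can also help to test the connection from a DataNode host. Here is a minimal sketch in Python, assuming Python 3 is available on the DataNode; the helper name `can_connect` is made up for illustration and is not part of Hadoop or Cloudera Manager.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connection and return (ok, error_text).

    ok is True when the connection succeeds; otherwise error_text holds
    the OS-level reason (refused, timed out, name not resolved, ...).
    """
    try:
        # create_connection resolves the name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # socket.timeout and socket.gaierror are both OSError subclasses.
        return False, str(exc)
```

Running `can_connect("devhdp01.xyz.int", 8022)` on a DataNode host would show whether the name resolves and the port is reachable from there. A successful connection only rules out network and firewall problems; it says nothing about whether the NameNode then accepts the DataNode's registration.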
Some of the blog posts I read suggested checking whether /etc/hosts maps the host name to the loopback address. Here is the content of my hosts file on the master. I commented out both lines and restarted HDFS through Cloudera Manager, but that didn't fix the issue.
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
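For anyone who wants to script the loopback check across several hosts, here is a rough sketch in Python. It assumes the standard hosts file layout (an address followed by one or more names, with `#` starting a comment); the function names `loopback_names` and `misplaced_hostnames` are hypothetical helpers, not anything shipped with CDH.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return {name: address} for every name mapped to a loopback address."""
    found = {}
    for line in hosts_text.splitlines():
        # Drop comments (including fully commented-out lines) and blanks.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not a plain IP address; skip the line
        if addr.is_loopback:
            for name in fields[1:]:
                found[name] = fields[0]
    return found


def misplaced_hostnames(hosts_text, cluster_hostnames):
    """Return the cluster host names that resolve to loopback in this file."""
    mapped = loopback_names(hosts_text)
    return sorted(name for name in cluster_hostnames if name in mapped)
```

Calling `misplaced_hostnames(open("/etc/hosts").read(), ["devhdp01.xyz.int"])` returns an empty list when the cluster host name is not tied to 127.0.0.1 or ::1. Note that the usual advice is about the machine's own host name appearing on the loopback line; the plain `localhost` entries are normally harmless.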
Any help is much appreciated. I have been stuck on this issue for a long time now. It looks like I am missing something simple, but I couldn't figure it out. Thank you!
UPDATE:
I see the following exception in the name node:
PriviledgedActionException as:cloudera-scm (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Access denied for user cloudera-scm. Superuser privilege is required
IPC Server handler 8 on 8022, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.versionRequest from 172.16.67.187:34353 Call#596 Retry#0
org.apache.hadoop.security.AccessControlException: Access denied for user cloudera-scm. Superuser privilege is required
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:79)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:6578)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1264)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:248)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28861)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080
This looks like the root cause. I am not sure why it doesn't have the privilege, though. In fact, cloudera-scm has passwordless sudo on all the hosts (a leftover permission from the previous installation in single user mode).
Created 03-04-2016 07:19 AM
I did a complete uninstall, cleaned up, and removed all the folders related to Hadoop and Cloudera. It's working fine now. Earlier, I had done a single user mode install and then uninstalled it to do the default installation. It looks like lingering files and folders were causing the issue.
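If it helps anyone doing the same cleanup, here is a small Python sketch for spotting directories that survived the uninstall, along with the UID that owns them. The paths in `CANDIDATE_DIRS` are only my assumption of common default locations and will differ between installations; `leftover_dirs` is a made-up helper name.

```python
import os

# ASSUMPTION: typical default locations, not taken from this cluster.
# Replace with the directories your own installation actually used.
CANDIDATE_DIRS = [
    "/var/lib/cloudera-scm-agent",
    "/var/run/cloudera-scm-agent",
    "/dfs/nn",
    "/dfs/dn",
]


def leftover_dirs(paths):
    """Return [(path, owner_uid)] for each path that still exists as a directory."""
    found = []
    for path in paths:
        if os.path.isdir(path):
            # The owner UID hints at which account created the directory,
            # e.g. one left behind by a single user mode install.
            found.append((path, os.stat(path).st_uid))
    return found
```

Calling `leftover_dirs(CANDIDATE_DIRS)` on each host before reinstalling gives a quick list of what is still there. It only reports; nothing is deleted.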
Created 11-23-2017 09:50 AM
Does anyone have an actual solution to this problem other than reinstalling the cluster? I am facing the same error on one of my DataNodes: it becomes unavailable after a restart and throws similar exceptions/errors.
PS: I cannot reinstall because it's my production cluster. Looking for help.
Thanks,
Shilpa
Created 05-26-2018 03:11 PM
I have the same problem now. Can you help, please?
Created 11-18-2020 12:34 AM
Did you get a solution to this issue? Would be grateful for your response.