Created 02-06-2018 02:04 AM
Hi,
Getting frequent alerts on the HDFS health test: the canary test failed to write a file in the directory /tmp/.cloudera_health_monitoring_canary_files. I checked the permissions and everything seems fine, and I can also see files being created in that directory.
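For reference, this is roughly how I checked (the path is the one from the alert):
hdfs dfs -ls -d /tmp/.cloudera_health_monitoring_canary_files    # owner and permissions of the directory itself
hdfs dfs -ls /tmp/.cloudera_health_monitoring_canary_files       # recent canary files inside it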
Created 02-06-2018 09:04 AM
Check the logs for more information:
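On a CM-managed CDH cluster with default log directories, the logs worth looking at are usually along these lines (adjust the paths if your install is customized; the Cloudera Service Monitor is the process that runs the canary):
/var/log/cloudera-scm-firehose/mgmt-cmf-mgmt-SERVICEMONITOR-*.log.out    # Service Monitor (canary results)
/var/log/hadoop-hdfs/hadoop-cmf-hdfs-NAMENODE-*.log.out                  # NameNode
/var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-*.log.out                  # DataNodes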
Created 02-06-2018 09:37 AM
Hi,
I did check the logs, and this is what I found:
Failed to write to /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2018_02_06-12_35_43. Error: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2018_02_06-12_35_43 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3442)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:686)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
Created 02-06-2018 09:40 AM
That indicates something is wrong in HDFS itself. The client could not write out the block because, even though three DataNodes are running, all three were excluded from the operation, so no DataNode was available to receive it. I would check the health of your NameNode via the NameNode Web UI and verify that it shows the DataNodes as up and heartbeating.
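The same information is available from the command line, for example (output layout varies slightly by version):
hdfs dfsadmin -report    # lists live/dead DataNodes with capacity, remaining space and last contact time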
Also, try using the command line to write a file to HDFS and see if you have a similar problem.
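Something along these lines works as a quick manual test (the file name is just an example):
hdfs dfs -put /etc/hosts /tmp/manual_canary_test    # write a small local file into HDFS
hdfs dfs -ls /tmp/manual_canary_test                # confirm the file was created
hdfs dfs -rm /tmp/manual_canary_test                # clean up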
Created 02-14-2018 12:07 AM
The issue was with the Java heap size on the DataNodes. Increasing the DataNode Java heap, sized according to the number of blocks stored on each DataNode, resolved the issue.
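For anyone who lands here later: in Cloudera Manager this is the DataNode role's Java heap size setting; on a plain Apache Hadoop install the equivalent would be something like the line below in hadoop-env.sh (the 4 GB value is only an example, size it to your block count):
export HADOOP_DATANODE_OPTS="-Xmx4g ${HADOOP_DATANODE_OPTS}"    # example heap for the DataNode JVM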