Canary test failed to write file in directory /tmp/.cloudera_health_monitoring_canary_files
Labels: Cloudera Manager, HDFS
Created 02-06-2018 02:04 AM
Hi,
I am getting frequent alerts on the HDFS health test: "Canary test failed to write file in directory /tmp/.cloudera_health_monitoring_canary_files". I checked the permissions and everything looks fine, and I can see the canary files being created in that directory as well.
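For reference, the permissions and recent canary files can be checked with something like this (a sketch; the path comes from the alert itself):
# Permissions on the canary directory, then a listing of recent canary files.
hdfs dfs -ls -d /tmp/.cloudera_health_monitoring_canary_files
hdfs dfs -ls /tmp/.cloudera_health_monitoring_canary_files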
Created 02-06-2018 09:04 AM
Check the logs for more information (a shell sketch for this follows the list):
- Service Monitor (/var/log/cloudera-scm-firehose/*SERVICE* file): look at the latest one. The Service Monitor initiates the canary test, so if it fails, there is likely more information or a stack trace there.
- Active NameNode (/var/log/hadoop-hdfs/*NAMENODE* file): note the time the failure was logged in the Service Monitor log, then check the NameNode log on the active NameNode host at the same time. If HDFS returned an error, it should be logged there too.
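A minimal sketch of that log check, assuming the default log paths named above; the timestamp is a placeholder you would replace with the failure time found in the Service Monitor log:
# Newest Service Monitor log, then search it for canary failures.
ls -t /var/log/cloudera-scm-firehose/*SERVICE* | head -1
grep -i canary /var/log/cloudera-scm-firehose/*SERVICE* | tail -20
# On the active NameNode host, check its log around the same time
# ('YYYY-MM-DD HH:MM' is a placeholder, substitute the actual failure time).
grep 'YYYY-MM-DD HH:MM' /var/log/hadoop-hdfs/*NAMENODE*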
Created 02-06-2018 09:37 AM
Hi,
I did check the logs, and this is what I found:
Failed to write to /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2018_02_06-12_35_43. Error:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2018_02_06-12_35_43 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3442)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:686)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
Created 02-06-2018 09:40 AM
That indicates something is wrong in HDFS itself: the client could not write the block because no DataNodes were available to accept it. I would check the health of your NameNode via the NameNode Web UI and verify that it shows the DataNodes as up and heartbeating.
Also, try using the command line to write a file to HDFS and see if you hit the same problem (see the sketch below).
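A quick command-line check, as a sketch (run as a user with HDFS write access; the file name is arbitrary):
# Confirm how many DataNodes the NameNode considers live.
hdfs dfsadmin -report | grep -A1 'Live datanodes'
# Try a simple write/read/cleanup cycle.
echo test > /tmp/hdfs_write_test.txt
hdfs dfs -put /tmp/hdfs_write_test.txt /tmp/hdfs_write_test.txt
hdfs dfs -cat /tmp/hdfs_write_test.txt
hdfs dfs -rm /tmp/hdfs_write_test.txt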
Created 02-14-2018 12:07 AM
The issue was with the Java heap space on the DataNodes. Increasing the DataNode Java heap size, scaled to the number of blocks each DataNode stores, resolved the issue (a sketch of the setting is below).
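For reference, in Cloudera Manager this is the HDFS > Configuration > "Java Heap Size of DataNode in Bytes" setting for the DataNode role. Outside CM, a minimal sketch of the same change in hadoop-env.sh would be (the 4g value is an assumption, not a recommendation; size it to your block count):
# hadoop-env.sh: raise the DataNode JVM max heap (example value only).
export HADOOP_DATANODE_OPTS="-Xmx4g ${HADOOP_DATANODE_OPTS}"
Restart the DataNodes after changing the heap so the new setting takes effect.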
