Created on 06-07-2015 04:50 AM - last edited on 11-08-2016 08:39 AM by cjervis
Health test shows the following errors:
The health test result for HDFS_FREE_SPACE_REMAINING has become bad: Space free in the cluster: 0 B. Capacity of the cluster: 0 B. Percentage of capacity free: 0.00%. Critical threshold: 10.00%.
The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.
We manually verified that space isn't an issue. Connectivity testing succeeded. There are no issues with the KDC or principals.
Please help explain the root cause of this error message.
Created 06-07-2015 05:38 AM
Created 06-07-2015 06:35 AM
Thanks for responding.
This is a new cluster. The DataNode is up and running. I verified the space through CM as well as by logging in to the servers themselves.
Created 06-17-2015 06:03 AM
Hi,
Have you checked the space in the NameNode's web UI? Is it showing correctly there?
Thanks,
Sathish
Created on 08-22-2015 08:47 AM - edited 08-22-2015 10:41 AM
I have the same issue with a brand new Cloudera Manager install on a 4-instance AWS EC2 m4.xlarge cluster, each instance with a 100 GiB magnetic disk.
Cloudera Manager Hosts view shows all 4 instances with a Disk Usage at 10.3-12.1 GiB / 115.6 GiB and "green" status.
The cluster is unusable with HDFS in the resulting RED status.
What was the final resolution on this?
Created on 08-22-2015 03:09 PM - edited 08-22-2015 03:09 PM
I verified the space by logging onto the server and issuing the following command:
ubuntu@ip-172-31-29-49:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       99G  8.3G   86G   9% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            7.9G   12K  7.9G   1% /dev
tmpfs           1.6G  496K  1.6G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            7.9G     0  7.9G   0% /run/shm
none            100M     0  100M   0% /run/user
cm_processes    7.9G   14M  7.9G   1% /run/cloudera-scm-agent/process
As you can see there is plenty of space available.
What do you suggest as a next step?
Created 11-18-2015 03:07 AM
Usually this indicates that the DataNodes are not in contact with the NameNode. 0 bytes means there are no DataNodes available to write to. Check the DataNode logs under /var/log/hadoop-hdfs.
There will be some clues there; paste anything that stands out in your response here.
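One quick way to confirm whether the NameNode sees any DataNodes at all is to look at the output of `hdfs dfsadmin -report`. Below is a minimal Python sketch that pulls the configured capacity and live DataNode count out of that text. It assumes the report contains lines of the form `Configured Capacity: <bytes> (...)` and `Live datanodes (<n>):`; the exact wording can differ between Hadoop versions, and the function names and sample text are illustrative only.

```python
import re

def summarize_dfsadmin_report(report_text):
    """Extract configured capacity (bytes) and live DataNode count from
    the text printed by `hdfs dfsadmin -report`.

    Assumes lines like 'Configured Capacity: 0 (0 B)' and
    'Live datanodes (4):'; a field that is not found is returned as None.
    """
    capacity = re.search(r"^Configured Capacity:\s*(\d+)", report_text, re.MULTILINE)
    live = re.search(r"^Live datanodes\s*\((\d+)\)", report_text, re.MULTILINE)
    return {
        "capacity_bytes": int(capacity.group(1)) if capacity else None,
        "live_datanodes": int(live.group(1)) if live else None,
    }

def datanodes_missing(summary):
    """True when the report shows zero (or no) capacity or live DataNodes."""
    return not summary["capacity_bytes"] or not summary["live_datanodes"]

# Illustrative sample text, not real output from the cluster in this thread.
sample = "Configured Capacity: 0 (0 B)\nDFS Remaining: 0 (0 B)\n"
print(datanodes_missing(summarize_dfsadmin_report(sample)))
```

If this reports that DataNodes are missing while the DataNode processes are running, the DataNode logs are the next place to look for registration or connection errors.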
Created 04-26-2019 06:27 AM
Created 04-26-2019 02:49 PM
This thread is super super old, so it would be best to confirm you are seeing the same issue. What message do you see regarding the canary test failure?
Basically, the Service Monitor performs a health check of HDFS by writing out a file and making sure the write completes. If it doesn't complete, that could indicate a problem with HDFS that requires review, so it triggers a bad health state.
The canary test does the following:
By default, the file name is:
/tmp/.cloudera_health_monitoring_canary_files
It is possible that the Service Monitor log (in /var/log/cloudera-scm-firehose) has some error or exception reflecting the failure.
Note that the operation of writing to a file in HDFS requires communication with the NameNode and then the DataNode that the NameNode tells the client to write the file to. Failures could occur in various places.
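To narrow down where such a failure occurs, a similar create/write/read/delete cycle can be tried by hand with the `hdfs dfs` shell. The sketch below is a manual approximation, not the Service Monitor's own code. The directory `/tmp/manual_canary_check` and the file name are hypothetical, and the function should be run as the user whose HDFS access you want to check.

```python
import subprocess
import uuid

def run_manual_canary(base_dir="/tmp/manual_canary_check", run=subprocess.run):
    """Try a create/write/read/delete cycle under base_dir using `hdfs dfs`.

    Returns (failed_step, stderr_text) for the first step that fails,
    or (None, "") if every step succeeds. base_dir is a hypothetical
    scratch directory, not the path the Service Monitor uses.
    """
    path = f"{base_dir}/check_{uuid.uuid4().hex}"  # hypothetical file name
    steps = [
        # (step name, command, bytes to send on stdin or None)
        ("mkdir", ["hdfs", "dfs", "-mkdir", "-p", base_dir], None),
        ("write", ["hdfs", "dfs", "-put", "-", path], b"canary\n"),
        ("read", ["hdfs", "dfs", "-cat", path], None),
        ("delete", ["hdfs", "dfs", "-rm", "-skipTrash", path], None),
    ]
    for name, cmd, stdin_bytes in steps:
        result = run(cmd, input=stdin_bytes, capture_output=True)
        if result.returncode != 0:
            # Stop at the first failure so the stderr text points at the cause.
            return name, result.stderr.decode(errors="replace")
    return None, ""
```

Calling `run_manual_canary()` on a host with the HDFS client configured returns the name of the first failing step together with its error text. A failure at the `mkdir` step with a permission error would point at directory permissions rather than at DataNode connectivity.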
Created 04-29-2019 03:10 AM
It was a user permissions issue.
All fixed now.
Thanks 🙂