Support Questions

cjervis · ‎06-07-2015

Health test shows the following errors:

The health test result for HDFS_FREE_SPACE_REMAINING has become bad: Space free in the cluster: 0 B. Capacity of the cluster: 0 B. Percentage of capacity free: 0.00%. Critical threshold: 10.00%.
The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.

We manually verified that space isn't an issue. Connectivity tests succeed. There are no issues with the KDC or principals.

Please help explain the root cause of this error message.

dice · ‎06-07-2015

Hi,

Did you make any changes to the cluster before you saw the message
"Space free in the cluster: 0 B"?
How did you verify that space is not the issue? Can you also verify that
the DataNodes are up?
Are there actual blocks in the DNs' local directories?

Cons_Big_Data · ‎06-07-2015

Thanks for responding.

This is a new cluster. The DN is up and running. I verified space through CM as well as by logging in to the servers themselves.

sathishkumar · ‎06-17-2015

Hi,

Have you checked the space in the NameNode's web UI? Does it look fine there?

Thanks,
Sathish (Satz)

GaryAZ · ‎08-22-2015

I have the same issue with a brand new Cloudera Manager install on a four-instance AWS EC2 m4.xlarge cluster, each instance with a 100 GiB magnetic disk.

Cloudera Manager Hosts view shows all 4 instances with a Disk Usage at 10.3-12.1 GiB / 115.6 GiB and "green" status.

The cluster is unusable with HDFS in the resulting RED status.

What was the final resolution on this?

GaryAZ · ‎08-22-2015

I verified the space by logging on to the server and issuing the following command:

ubuntu@ip-172-31-29-49:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       99G  8.3G   86G   9% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            7.9G   12K  7.9G   1% /dev
tmpfs           1.6G  496K  1.6G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            7.9G     0  7.9G   0% /run/shm
none            100M     0  100M   0% /run/user
cm_processes    7.9G   14M  7.9G   1% /run/cloudera-scm-agent/process

As you can see there is plenty of space available.

What do you suggest as a next step?

Justin@cloudera · ‎11-18-2015

Usually this indicates that the DataNodes are not in contact with the NameNode. 0 bytes means there are no DataNodes available to write to. Check the DataNode logs under /var/log/hadoop-hdfs.

There will be some clues there. Paste anything that stands out in a reply here.
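
To go with the log check, here is a rough Python sketch for reading the cluster summary that `hdfs dfsadmin -report` prints. It assumes the report contains a "Configured Capacity: N (...)" line and a "Live datanodes (N):" line; those labels may differ between Hadoop versions, so treat the patterns as assumptions. The function names are made up for illustration, and the sample text in the usage note is hypothetical, not output from the cluster in this thread.

```python
import re


def summarize_report(report_text):
    """Pull cluster capacity and live DataNode count out of report text.

    Assumes lines such as 'Configured Capacity: 0 (0 B)' and
    'Live datanodes (4):'. A missing line yields None for that field.
    """
    capacity = re.search(r"^Configured Capacity:\s*(\d+)", report_text, re.MULTILINE)
    live = re.search(r"^Live datanodes \((\d+)\)", report_text, re.MULTILINE)
    return {
        "configured_capacity_bytes": int(capacity.group(1)) if capacity else None,
        "live_datanodes": int(live.group(1)) if live else None,
    }


def diagnose(summary):
    """Turn the summary into a short hint about where to look next."""
    # Treat both "no such line" (None) and an explicit 0 as "no live DataNodes".
    if not summary["live_datanodes"]:
        return "no live DataNodes reported; check DataNode logs and NameNode connectivity"
    if summary["configured_capacity_bytes"] == 0:
        return "DataNodes are live but report zero capacity; check the data directories"
    return "capacity and live DataNodes look normal"
```

For example, feeding it a hypothetical report that starts with "Configured Capacity: 0 (0 B)" and has no live DataNode section gives the first hint, which matches the "0 B" health message above. In practice you would pass in the text captured from running `hdfs dfsadmin -report` as a user allowed to run it.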

ditu · ‎04-26-2019

I've had the same issue. I just checked the logs on the DataNodes and they are successfully registering with the NN.

bgooley · ‎04-26-2019

@ditu,

This thread is super super old, so it would be best to confirm you are seeing the same issue. What message do you see regarding the canary test failure?

Basically, the Service Monitor performs a health check of HDFS by writing out a file and making sure that completes. If it doesn't complete, that could mean there are problems with HDFS that require review, so this triggers a bad health state.

The canary test does the following:

creates a file
writes to it
reads it back
verifies the data
deletes the file

By default, the file name is:

/tmp/.cloudera_health_monitoring_canary_files

It is possible that the Service Monitor log (in /var/log/cloudera-scm-firehose) has some error or exception reflecting the failure.

Note that the operation of writing to a file in HDFS requires communication with the NameNode and then the DataNode that the NameNode tells the client to write the file to. Failures could occur in various places.
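
If you want to reproduce the canary steps by hand, something like the following Python sketch may help narrow down which step fails. It is not the Service Monitor's implementation, just an approximation that shells out to the standard `hdfs dfs` commands (`-mkdir -p`, `-put -f`, `-cat`, `-rm -skipTrash`). The directory `/tmp/manual_canary_check`, the file name, and the function names are hypothetical; a separate directory is used so the real canary path is left alone.

```python
import os
import subprocess
import tempfile

# Hypothetical test directory under /tmp in HDFS, not the real canary path.
TEST_DIR = "/tmp/manual_canary_check"


def canary_commands(local_file, hdfs_dir=TEST_DIR, name="canary_test_file"):
    """Build the command sequence: create dir, write, read back, delete."""
    target = f"{hdfs_dir}/{name}"
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        ["hdfs", "dfs", "-put", "-f", local_file, target],
        ["hdfs", "dfs", "-cat", target],
        ["hdfs", "dfs", "-rm", "-skipTrash", target],
    ]


def run_canary(payload=b"canary\n", runner=subprocess.run):
    """Run each step in order and stop at the first failure.

    Returns (ok, message). `runner` defaults to subprocess.run and can be
    swapped out for testing without a cluster.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        local_file = tmp.name
    try:
        for cmd in canary_commands(local_file):
            result = runner(cmd, capture_output=True)
            if result.returncode != 0:
                detail = result.stderr.decode(errors="replace").strip()
                return False, f"{' '.join(cmd)} failed: {detail}"
            # Verify step: the bytes read back must equal the bytes written.
            if cmd[2] == "-cat" and result.stdout != payload:
                return False, "data read back does not match data written"
        return True, "canary succeeded"
    finally:
        os.unlink(local_file)
```

Calling `run_canary()` as the user whose access you want to test returns a pair such as `(False, "hdfs dfs -mkdir ... failed: ...")`, with the stderr of the failing step in the message. A failure at the `-mkdir` step would point at directory creation under /tmp, while a failure at `-put` would point more toward the write path through the NameNode and DataNodes.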

ditu · ‎04-29-2019

It was a user permissions issue.

All fixed now.

Thanks 🙂

Cloudera Community

HDFS goes in bad health