Created on 06-07-2015 04:50 AM - last edited on 11-08-2016 08:39 AM by cjervis
Health test shows the following errors:
The health test result for HDFS_FREE_SPACE_REMAINING has become bad: Space free in the cluster: 0 B. Capacity of the cluster: 0 B. Percentage of capacity free: 0.00%. Critical threshold: 10.00%.
The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.
We manually verified that space isn't an issue. Connectivity tests succeed, and there are no issues with the KDC or principals.
Please help explain the root cause of these error messages.
Created 05-06-2019 09:55 AM
Could you please explain in more detail how you fixed this issue?
Created 07-31-2019 04:28 AM
What permissions were changed to correct the issue?
Created 07-31-2019 02:07 PM
While we can't be sure, it is likely that permissions on the /tmp directory in HDFS were changed so that the Service Monitor, which executes the HDFS canary health check, could no longer access the directory. Service Monitor uses the "hue" user and principal to access other resources, so it is reasonable to assume that /tmp in HDFS did not allow the hue user or group to write to it.
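To make the permission reasoning concrete, here is a minimal Python sketch that takes one line of `hdfs dfs -ls -d /tmp` output and applies the basic owner/group/other check for a given user. The function name `can_create_in`, the sample listing, and the group list passed in are assumptions for illustration only. The sketch does not evaluate ACLs or HDFS superuser status, so treat its answer as a hint rather than a verdict.

```python
def can_create_in(ls_line, user, groups=()):
    """Return True if `user` has write and execute on the listed directory.

    `ls_line` is one line as printed by `hdfs dfs -ls -d <dir>`:
    permissions, replication, owner, group, size, date, time, path.
    """
    fields = ls_line.split()
    mode, owner, group = fields[0], fields[2], fields[3]
    # A trailing '+' marks an extended ACL, which this sketch ignores.
    mode = mode.rstrip("+")
    if user == owner:
        triple = mode[1:4]      # owner bits
    elif group in groups:
        triple = mode[4:7]      # group bits
    else:
        triple = mode[7:10]     # other bits
    # Creating an entry needs write plus execute; 't' is execute + sticky bit.
    return triple[1] == "w" and triple[2] in ("x", "t")


# Hypothetical listing, not taken from the cluster in this thread.
sample = "drwxr-xr-x   - hdfs supergroup          0 2019-07-31 14:07 /tmp"
print(can_create_in(sample, "hue", ["hue"]))  # prints False
```

If the check comes back False for the hue user, that is consistent with the canary being unable to create its parent directory under /tmp.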
Are you having similar trouble? If so, check your Service Monitor log file for stack traces and errors related to the HDFS canary.
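If the log is large, a small script can pull out the relevant entries. The following is only a sketch: the function name `canary_entries` is made up, and it assumes each log entry begins with a timestamp (a leading digit) while stack trace lines do not. Adjust that test if your log layout differs.

```python
def canary_entries(lines, keyword="canary"):
    """Collect log entries that mention `keyword`, including any lines
    (such as a Java stack trace) that follow before the next entry."""
    entries, current = [], None
    for raw in lines:
        line = raw.rstrip("\n")
        if line[:1].isdigit():
            # Assumed: a leading digit means a timestamp, so a new entry starts.
            current = [line] if keyword.lower() in line.lower() else None
            if current is not None:
                entries.append(current)
        elif current is not None:
            # Continuation of a matching entry, e.g. exception text or frames.
            current.append(line)
    return entries
```

To use it, open your Service Monitor log file, pass the file object to `canary_entries`, and print each returned entry joined with newlines.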
Created 05-22-2019 07:22 AM
Hi,
I have the same issue. I looked in the service manager log, and it says it is failing to connect to the server: connection refused.
It also reports a detected pause in the JVM or host machine.
I am quite new to Cloudera Manager and HDFS, so is there a way I can check the connection and reconnect to the server?
Thanks,
Jess
Created 05-22-2019 10:10 AM
Hi @jess, welcome to the Cloudera Community.
To be sure we understand the problem, please share a screenshot or two of what you are seeing.
Make sure you click on the HDFS service and then look at the Instances tab to see what HDFS roles are in bad health. Also look at the "Health Tests" section to see if anything is reported there. Click on any roles that are in bad health to see more information about what health tests are failing.
Also, good job looking at the Service Monitor log for clues. Can you show us the stack trace or log messages that say "connection refused"? The Service Monitor makes connections to several servers, so it is important to know which one it was connecting to when the connection refused error occurred.
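Once you know the host and port from the log message, you can test that connection yourself from the Service Monitor host. Here is a minimal Python sketch using only the standard library. The function name `port_reachable` is made up for illustration, and the host and port you pass in must come from your own log, since they are not shown in this thread.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A False result only confirms that nothing accepted the connection at that address. It does not say why, so the role's own log on the target host is the next place to look.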
Thanks!