12-16-2014 05:00 AM
I just upgraded our cluster from CDH 5.0.1 to 5.2.1, using parcels and following the provided instructions.
After the upgrade finished, the health test "Data Directory Status" is critical for one of the DataNodes. The reported error message is "The DataNode has 1 volume failure(s)". Running 'hdfs dfsadmin -report' also confirms that the available HDFS space on that node is approximately 4 TB less than on the other nodes, indicating that one of the disks is not being used.
However, when checking the status of the actual disks and the regular file system, we cannot find anything wrong. All disks are mounted and appear to be working normally. There is also an in_use.lock file in the dfs/nn directory on every disk.
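A per-directory check of this kind can be sketched as follows. This is only a minimal illustration, not the DataNode's own volume check: the function names and the mount points in the example are hypothetical placeholders, and the script should be run as the user that owns the DataNode process, since permission results differ per user (root passes every permission check).

```python
import os
import tempfile


def check_data_dir(path):
    """Return a list of problems found for one directory (empty if none)."""
    if not os.path.isdir(path):
        return ["not a directory or missing"]

    problems = []
    # Permission bits as seen by the user running this script.
    for flag, name in ((os.R_OK, "readable"),
                       (os.W_OK, "writable"),
                       (os.X_OK, "executable")):
        if not os.access(path, flag):
            problems.append("not " + name)

    # Actually create and remove a file; this also catches a file system
    # that was silently remounted read-only after an I/O error.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append("write test failed: %s" % exc)
    return problems


def check_all(paths):
    """Map each path to its list of problems."""
    return {p: check_data_dir(p) for p in paths}


if __name__ == "__main__":
    # Hypothetical mount points; substitute the directories that are
    # actually configured for the DataNode.
    for path, problems in check_all(["/data/1", "/data/2"]).items():
        print(path, "OK" if not problems else "; ".join(problems))
```

A directory that passes this sketch can still be rejected by the DataNode for other reasons, so a clean result only rules out the basic permission and write problems.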
How can I get more detailed information about which volume the DataNode is complaining about, and what the issue might be?