This is a follow-up post on a previous topic about disk space not being freed up when deleting files from HDFS: http://community.cloudera.com/t5/Cloudera-Manager-Installation/Deleting-files-does-not-clear-disk-sp...
After finding out more about how this behaviour arises, I felt it made sense to create a new thread in the HDFS section instead.
It seems that files added to HDFS before upgrading from the CDH 5 beta2 to the current version have their blocks duplicated on disk. Running fsck /[path-to-arbitrary-old-file] -files -blocks -locations reports the file with a replication factor of 3, as expected. However, when searching for one of the file's blocks in the local file system of a data node that holds it, there are two hits for each block replica: one under dfs/dn/current/BP-.../current/finalized/ and another under dfs/dn/.../previous/finalized.
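For anyone who wants to check their own data nodes for the same pattern, here is a minimal Python sketch of the search described above. It only reads the directory tree. The function names are made up for illustration, and it assumes block files are named blk_* and sit somewhere below a directory called finalized whose parent is either current or previous, as in the paths above. The data directory itself is a parameter, since it depends on the cluster configuration.

```python
import os
from collections import defaultdict


def find_block_copies(data_dir):
    """Map each block file name to the paths where it is found, split into
    copies under current/finalized and copies under previous/finalized."""
    copies = defaultdict(lambda: {"current": [], "previous": []})
    for root, _dirs, files in os.walk(data_dir):
        parts = root.split(os.sep)
        if "finalized" not in parts:
            continue
        idx = parts.index("finalized")
        # Assumption: the directory directly above "finalized" tells us
        # whether this is the current or the previous storage layout.
        parent = parts[idx - 1] if idx > 0 else ""
        if parent not in ("current", "previous"):
            continue
        for name in files:
            # Skip the accompanying .meta checksum files.
            if name.startswith("blk_") and not name.endswith(".meta"):
                copies[name][parent].append(os.path.join(root, name))
    return copies


def duplicated_blocks(data_dir):
    """Return the sorted names of blocks present in both locations."""
    return sorted(
        name
        for name, locations in find_block_copies(data_dir).items()
        if locations["current"] and locations["previous"]
    )
```

Calling duplicated_blocks() with the data node's storage directory should list the blocks that show up twice, which in our case was every block belonging to a file written before the upgrade.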
Now, when deleting the file from HDFS, only the block copies under the current path are actually deleted. The others are left on the data nodes, seemingly orphaned from the name node metadata. The HDFS log file only mentions deleting the current copy and contains no error messages.
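To get a rough idea of how much space these leftover copies occupy on a data node, something like the following Python sketch could be used. It is read-only and deletes nothing. The function name is hypothetical, and it relies on the same assumptions about the layout: files named blk_* below a finalized directory whose parent is current or previous. A block counts as left over when it exists under previous but no longer under current.

```python
import os


def previous_only_usage(data_dir):
    """Return (file_count, total_bytes) for blk_* files that exist under
    previous/finalized but have no file of the same name under
    current/finalized."""
    current_names = set()
    previous_sizes = {}
    for root, _dirs, files in os.walk(data_dir):
        parts = root.split(os.sep)
        if "finalized" not in parts:
            continue
        idx = parts.index("finalized")
        # Assumption: the parent of "finalized" is "current" or "previous".
        parent = parts[idx - 1] if idx > 0 else ""
        for name in files:
            if not name.startswith("blk_"):
                continue
            if parent == "current":
                current_names.add(name)
            elif parent == "previous":
                size = os.path.getsize(os.path.join(root, name))
                previous_sizes[name] = previous_sizes.get(name, 0) + size
    leftovers = {
        name: size
        for name, size in previous_sizes.items()
        if name not in current_names
    }
    return len(leftovers), sum(leftovers.values())
```

The count includes the .meta files next to each block, since they take up space as well.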
Files that were added after the HDFS upgrade do not behave this way, so I suspect this has something to do with the finalization of the metadata upgrade not working properly (or not having been performed at all).
As my previous post mentions, we will soon run out of disk space on the cluster and would like to resolve this as soon as possible. Would it be safe to manually remove all files under the previous paths from the local file systems of all data nodes? Or could the name node still hold references to them that would then become corrupt? Are there other ways to solve this?
Thanks for the reply!
The finalize command was already run after the upgrade. I also tried rerunning it after identifying the cause of this issue. I think the problem here is that the NN for some reason did not clean up the finalized folder after itself. We have now deleted all of the orphaned blocks manually, and things have been working well since then.