One of my Hadoop data directories is full on every cluster instance (the same drive each time), at 100% usage.
I have deleted almost all the data in HDFS with -skipTrash plus an expunge. I even tried rebooting all the boxes, but the directory is still full on every cluster member.
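For reference, the cleanup was done with commands along these lines (the path here is just an example, not the real one):

> hdfs dfs -rm -r -skipTrash /some/old/dataset
> hdfs dfs -expunge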
When I dive into the directory structure, I can see that it is the HDFS block pool area.
> hdfs dfs -du /
0  /mapred
0  /mr-history
> hdfs dfs -df /
Filesystem      Size          Used          Available     Use%
hdfs://X:8020   412794792448  186773504950  149000060339  45%
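Those NameNode-side numbers clearly disagree with what the OS reports on the datanodes. To compare the two views I have been running something like this (the local path is my dfs.datanode.data.dir, yours may differ):

> hdfs dfsadmin -report
> du -sh /hadoop/hdfs/data/current/BP-*/current/finalized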
If I go down into the data directory, I end up finding block pool files that are not known to the NameNode when you try to fsck them by blockId, while others are.
blk_1073780387             blk_1073780392             blk_1073780395             blk_1073780463             blk_1073780475
blk_1073780387_39569.meta  blk_1073780392_39574.meta  blk_1073780395_39577.meta  blk_1073780463_39645.meta  blk_1073780475_39657.meta

> hdfs fsck -locations -files -blockId blk_1073780463
Connecting to namenode via http://X.X.X.X:50070/fsck?ugi=hdfs&locations=1&files=1&blockId=blk_1073780463+&path=%2F
FSCK started by hdfs (auth:X) from /X.X.X.X at Mon Jan 22 14:30:02 GMT 2018
Block blk_1073780463 does not exist
>
Has anyone ever seen something like this? It sounds as if the files were deleted on the NameNode side but not on the local filesystem. Is there a command I can run to check that integrity, and can I safely delete any blk_nnnnn file that fsck does not recognize?
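I was thinking of something along these lines to list every block file the NameNode has no record of before deleting anything (a rough sketch; /hadoop/hdfs/data is my dfs.datanode.data.dir and the subdir depth is from my layout, so adjust both):

# Walk the finalized block pool area and fsck each block id against the NameNode.
for f in /hadoop/hdfs/data/current/BP-*/current/finalized/subdir*/subdir*/blk_*; do
  case "$f" in *.meta) continue ;; esac   # skip the .meta checksum files
  b=$(basename "$f")
  # fsck -blockId prints "does not exist" when the NameNode has no record of the block
  hdfs fsck -blockId "$b" 2>/dev/null | grep -q "does not exist" && echo "unknown to namenode: $f"
done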
Thanks in advance for your help