Created 09-28-2017 03:00 PM
Hi All,
Need some help. When the `hdfs dfs -du -s -h /` command is executed, I see the below user has 12 TB of trash, but when listing the files the directory is empty. Please find the output below:
-bash-4.2$ hadoop fs -ls /user/mzhou1/.Trash
-bash-4.2$ hadoop fs -du -s /user/mzhou1/.Trash
12248499878795 /user/mzhou1/.Trash
-bash-4.2$ hadoop fs -ls /user/mzhou1/.Trash
-bash-4.2$
It's not only this user; other directories are facing a similar issue. Is it because of snapshots stored in HDFS, where the blocks are still allocated? Please provide inputs.
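A quick way to check whether snapshots are involved is to list the snapshottable directories and look inside the trash checkpoints. This is only a sketch; the paths below come from the output above, and whether you see every directory depends on running as the HDFS superuser:

```shell
# List all snapshottable directories visible to this user
# (run as the hdfs superuser to see every snapshottable dir on the cluster)
hdfs lsSnapshottableDir

# Trash is organized into timestamped checkpoint dirs plus Current;
# a recursive listing shows whether any checkpoint still holds files
hadoop fs -ls -R /user/mzhou1/.Trash
```

If `-ls -R` is empty but `-du` is large, the space is typically being held by snapshots of an ancestor directory rather than by live files.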
Created 09-30-2017 01:15 AM
Yes, there may be a snapshot on that particular directory. Please delete all snapshots of that directory and check again after the next trash checkpoint.
Hope this helps.
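A sketch of that cleanup, assuming the snapshot name is whatever `-ls .snapshot` reports (the snapshot name below is a placeholder, not taken from this thread):

```shell
# See which snapshots exist on the snapshottable directory
hadoop fs -ls /user/mzhou1/.snapshot

# Delete each snapshot by name (placeholder name shown)
hdfs dfs -deleteSnapshot /user/mzhou1 s20170901-120000.000

# Force an immediate trash checkpoint/cleanup for the current user
hdfs dfs -expunge
```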
Created 10-01-2017 03:06 PM
Hi,
thanks for the reply,
but I don't see a snapshot for the user ID below:
-bash-4.2$ hdfs dfs -du -h /user/alin/
1.3 T /user/alin/.Trash
40.5 M /user/alin/.hiveJars
-bash-4.2$ hdfs dfs -du -h /user/alin/.snapshot
du: `/user/alin/.snapshot': No such file or directory
-bash-4.2$ hdfs dfs -ls /user/alin/.snapshot
ls: `/user/alin/.snapshot': No such file or directory
I have extracted the fsimage and am using ELK to analyze HDFS storage usage, and the space consumption of certain directories doesn't match the actual size on the cluster.
Eg: /user/alin is consuming 1 TB of storage according to the `hdfs dfs -du` command, but in the fsimage the same user is shown consuming only 40.5 M. On running `hdfs dfs -ls` on /user/alin I don't see any files adding up to 1 TB.
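One way to confirm whether snapshots explain the gap: Hadoop 2.8 and later add a `-x` flag to `-du` and `-count` that excludes snapshot data from the totals (I believe releases based on Hadoop 2.7, such as older HDP versions, don't have it, so treat this as version-dependent):

```shell
# Total including snapshot data (default behavior)
hdfs dfs -du -s -h /user/alin

# Total excluding snapshot data (Hadoop 2.8+); a large difference
# between the two numbers means snapshots are holding the blocks
hdfs dfs -du -s -h -x /user/alin
hdfs dfs -count -x /user/alin
```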
As checked with my operations team, they say it's because of HDFS snapshots saved on the cluster: the blocks are still allocated, so du reports those values. But I don't see any Hortonworks doc mentioning that.
If that's the case, how do I calculate the actual storage used by HDFS? Does the fsimage give accurate data?
How do I make sure the fsimage also captures snapshot details?
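Regarding snapshot details in the fsimage: the offline image viewer's XML processor emits snapshot sections in addition to the per-inode data, whereas a delimited/per-file extraction only covers the current namespace, which could explain the mismatch in your ELK view. A sketch, where the fsimage filename is a placeholder for whatever file you actually fetch:

```shell
# Fetch the latest fsimage from the active NameNode
# (or copy it from the NameNode's metadata directory)
hdfs dfsadmin -fetchImage /tmp

# Dump it as XML; the SnapshotSection and SnapshotDiffSection
# elements record the snapshot data that per-file listings miss
hdfs oiv -p XML -i /tmp/fsimage_0000000000000000000 -o /tmp/fsimage.xml
grep -m1 SnapshotDiffSection /tmp/fsimage.xml
```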
Created 09-30-2017 03:34 AM
Run the following:
hadoop fs -ls /user/mzhou1/.snapshot
then also run:
hadoop fs -du -s /user/mzhou1/.snapshot