hdfs disk usage issues

Hi All,

I need some help. When I run hdfs dfs -du -s -h /, I see that the user below has 12 T in trash, but listing the directory shows it is empty. Please see the output below:

-bash-4.2$ hadoop fs -ls /user/mzhou1/.Trash

-bash-4.2$ hadoop fs -du -s /user/mzhou1/.Trash

12248499878795 /user/mzhou1/.Trash

-bash-4.2$ hadoop fs -ls /user/mzhou1/.Trash

-bash-4.2$

This is not limited to this user; other directories show the same issue. Is it because of snapshots stored in HDFS, for which blocks are still allocated? Please advise.

3 REPLIES

Re: hdfs disk usage issues

@sudi ts

Yes, there might be a snapshot on that particular directory. Please delete all snapshots of that directory and check again after a checkpoint.
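Snapshots are not necessarily taken on the directory itself; they may have been taken on a parent such as /user or /. The sketch below lists the snapshottable directories that contain a given path. It assumes the standard hdfs lsSnapshottableDir command, that the path is the last column of its output, and that paths contain no spaces. The function name snapshottable_ancestors is made up for illustration.

```shell
# Sketch: print every snapshottable directory that is the given path
# or one of its ancestors.
# Assumptions: `hdfs lsSnapshottableDir` prints one directory per line
# with the path in the last column; paths contain no whitespace.
snapshottable_ancestors() {
  target="$1"
  hdfs lsSnapshottableDir | awk -v t="$target" '
    {
      p = $NF
      # Match the root, the path itself, or any proper ancestor of it.
      if (p == "/" || t == p || index(t, p "/") == 1) print p
    }'
}

# Example (not executed here):
#   snapshottable_ancestors /user/mzhou1/.Trash
```

For each directory it prints, hdfs dfs -ls <dir>/.snapshot shows the snapshot names, and hdfs dfs -deleteSnapshot <dir> <snapshotName> removes one. Note that lsSnapshottableDir only lists directories the current user is allowed to see, so it is best run as the HDFS superuser.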

Hope this helps you.

Re: hdfs disk usage issues

Hi,

Thanks for the reply, but I don't see a snapshot for the user ID below:

-bash-4.2$ hdfs dfs -du -h /user/alin/

1.3 T /user/alin/.Trash

40.5 M /user/alin/.hiveJars

-bash-4.2$ hdfs dfs -du -h /user/alin/.snapshot

du: `/user/alin/.snapshot': No such file or directory

-bash-4.2$ hdfs dfs -ls /user/alin/.snapshot

ls: `/user/alin/.snapshot': No such file or directory

I have extracted the fsimage and am using ELK to see the storage used by HDFS. The space consumption of certain directories doesn't match the actual size on the cluster.

E.g., /user/alin is consuming 1 TB of storage according to the hdfs dfs -du command, but the fsimage shows the same user consuming 40.5 M. When I run hdfs dfs -ls on /user/alin, I don't see any files adding up to 1 TB.

I checked with my operations team, and they say that du reports those values because of HDFS snapshots saved on the cluster, since the blocks are still allocated. But I don't see any Hortonworks documentation mentioning that.

If that's the case, how do I calculate the actual storage used by HDFS? Does the fsimage give accurate data?

How do I make sure the fsimage also includes snapshot details?
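One way to estimate how much of a directory's reported size is held only by snapshots is to run du twice, with and without snapshot data, and compare. The sketch below assumes a Hadoop release whose du supports the -x option (exclude snapshots from the calculation); older releases may not have it. The function name snapshot_overhead is hypothetical.

```shell
# Sketch: bytes counted by `du` for a path that come only from snapshots.
# Assumption: this Hadoop version supports `-du -x` (exclude snapshots);
# without -x, du counts all inodes under the path, including snapshot data.
snapshot_overhead() {
  path="$1"
  # The first column of `-du -s` output is the size in bytes.
  with_snap=$(hdfs dfs -du -s "$path" | awk '{print $1}')
  without_snap=$(hdfs dfs -du -s -x "$path" | awk '{print $1}')
  echo $((with_snap - without_snap))
}

# Example (not executed here):
#   snapshot_overhead /user/alin/.Trash
```

A result close to the full du figure would suggest that the space is held by snapshots of files that have since been deleted from the live namespace, which would also be consistent with an empty ls listing.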

Re: hdfs disk usage issues

@sudi ts

Run the following:

hadoop fs -ls /user/mzhou1/.snapshot

Then also run:

hadoop fs -du -s /user/mzhou1/.snapshot