Explorer
Posts: 17
Registered: 06-02-2014
Accepted Solution

hdfs du format change

I just switched from Cloudera 4 to Cloudera 5, and the output format of hdfs dfs -du has changed. It now has two size columns instead of one. I'm guessing that the first is the actual content size and the second is the block storage consumption, but I can't find any documentation about this. Can anyone clarify and/or point me in the right direction?

Cloudera Employee
Posts: 578
Registered: 01-20-2014

Re: hdfs du format change

My output looks like this on both CDH4 and CDH5.

# sudo -u hdfs hdfs dfs -du /
20143 /accumulo
3230 /hbase
0 /solr
4 /tmp
287725904 /user


Please paste your command and output here so we can have a look.


Regards,
Gautam Gopalakrishnan
Explorer
Posts: 17
Registered: 06-02-2014

Re: hdfs du format change

I'm on Cloudera 5.3.3. Here's my command line and output:

 

[hadoop]$ hdfs dfs -du /
2298676940886 6896030822658 /output
21297905593 63893716779 /tmp
6072184915396 18216555409976 /user

Cloudera Employee
Posts: 578
Registered: 01-20-2014

Re: hdfs du format change

I checked on CDH 5.2 and the output was the same as on CDH4. The format you see was introduced in CDH 5.3; the JIRA is HADOOP-6857. The second column is meant to report the raw disk usage of the file or directory.

 

So if a file is 42 bytes, column 2 correctly shows (file size) * (replication factor of file), here 42 * 3 = 126:

 

$ hadoop fs -du /hbase/hbase.id
42  126  /hbase/hbase.id
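
A quick way to sanity-check this reading is to divide column 2 by column 1 and see whether the result matches the replication factor. The snippet below is only a sketch: the helper names parse_du_line and implied_replication are made up for illustration, it expects the plain (non -h) two-column output, and the sample lines are copied from the outputs pasted in this thread.

```python
def parse_du_line(line):
    """Split one line of two-column `-du` output into (size, raw, path).

    Hypothetical helper: assumes plain byte counts (no -h suffixes).
    """
    size, raw, path = line.split(None, 2)
    return int(size), int(raw), path.strip()


def implied_replication(size, raw):
    """Ratio of column 2 to column 1; None when the path is empty."""
    return raw / size if size else None


# Sample lines taken from the outputs posted in this thread.
sample = """\
42  126  /hbase/hbase.id
2298676940886 6896030822658 /output
21297905593 63893716779 /tmp
6072184915396 18216555409976 /user
"""

for line in sample.splitlines():
    size, raw, path = parse_du_line(line)
    print(path, implied_replication(size, raw))
```

For /hbase/hbase.id, /output and /tmp the ratio is exactly 3. For /user it is very slightly above 3; the thread does not say why, so treat any explanation (for example, a few files with a different replication factor) as a guess.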

 

However, it seems incorrect for directories: column 2 shows (theoretical block size) * (replication factor of file) * (number of files in directory) rather than the space the files actually occupy.

 

$ hadoop fs -du -h /hbase/WALs
0    0      /hbase/WALs/hregion-67213513
83   384 M  /hbase/WALs/server54-2.abc.cloudera.com,22101,1431434548533
166  768 M  /hbase/WALs/server54-3.abc.cloudera.com,22101,1431434548804
83   384 M  /hbase/WALs/server54-4.abc.cloudera.com,22101,1431434547067

$ hadoop fs -ls -R /hbase/WALs/server54-2.abc.cloudera.com,22101,1431434548533
-rw-r--r--   3 hbase hbase         83 2015-05-12 18:42 /hbase/WALs/server54-2.abc.cloudera.com,22101,1431434548533/server54-2.abc.cloudera.com%2C22101%2C1431434548533.default.1431481363257

$ hadoop fs -ls -R /hbase/WALs/server54-3.abc.cloudera.com,22101,1431434548804
-rw-r--r--   3 hbase hbase         83 2015-05-12 19:42 /hbase/WALs/server54-3.abc.cloudera.com,22101,1431434548804/server54-3.abc.cloudera.com%2C22101%2C1431434548804..meta.1431484974652.meta
-rw-r--r--   3 hbase hbase         83 2015-05-12 19:42 /hbase/WALs/server54-3.abc.cloudera.com,22101,1431434548804/server54-3.abc.cloudera.com%2C22101%2C1431434548804.default.1431484963607
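
To make the discrepancy concrete, here is a rough sketch comparing the two calculations for the WAL directories above. The function names are hypothetical, and the 128 MiB block size is an assumption inferred from 384 M / 3; the block size is not shown anywhere in the output.

```python
MIB = 1024 * 1024

# Assumption: 128 MiB block size, inferred from 384 M / 3 replicas.
BLOCK_SIZE = 128 * MIB
REPLICATION = 3  # from the `-ls -R` listings above


def expected_raw(file_sizes, replication):
    """Raw usage if column 2 were (sum of file sizes) * replication."""
    return sum(file_sizes) * replication


def block_based_raw(num_files, replication, block_size):
    """Raw usage under the formula the directory output appears to follow."""
    return block_size * replication * num_files


# File sizes in bytes, taken from the listings above.
directories = {
    "server54-2": [83],
    "server54-3": [83, 83],
}

for name, sizes in directories.items():
    expected = expected_raw(sizes, REPLICATION)
    reported = block_based_raw(len(sizes), REPLICATION, BLOCK_SIZE)
    print(f"{name}: expected {expected} bytes, "
          f"block-based {reported // MIB} MiB")
```

Under these assumptions the block-based formula gives 384 MiB for the one-file directory and 768 MiB for the two-file directory, matching column 2 above, whereas size times replication would give only 249 and 498 bytes.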

 

Regards,
Gautam Gopalakrishnan