
HDFS Reports

Rising Star

Hello,
We are getting alerts for block count on one of our DataNodes because it has crossed the threshold of 10,000 blocks. Since the HDFS balancer did not fix the issue, I next looked at whether we are hitting a small-files problem. I tried to put together a report via a terminal script (hdfs dfs -ls -R /tmp | grep ^- | awk '{if ($5 < 134217728) print $5, $8;}' | head -5 | column -t), but when I compare the script output against the HDFS report from Cloudera Manager, I see a difference in the size of the same file.
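For illustration, the filter portion of that pipeline can be exercised against a hypothetical two-line listing (file names and sizes below are made up, not from a real cluster). 134217728 bytes is the default 128 MB HDFS block size, and the column flag must be a plain ASCII -t:

```shell
# Sample output in the format produced by `hdfs dfs -ls -R` (illustrative only).
listing='-rw-r--r--   3 hdfs supergroup   52428800 2023-01-01 10:00 /tmp/small.log
-rw-r--r--   3 hdfs supergroup  268435456 2023-01-01 10:00 /tmp/big.dat'

# Keep regular files (lines starting with "-") smaller than one default
# HDFS block (128 MB = 134217728 bytes); print size and path, aligned.
printf '%s\n' "$listing" \
  | grep '^-' \
  | awk '$5 < 134217728 { print $5, $8 }' \
  | column -t
```

Field 5 of the listing is the file length in bytes as reported by the NameNode, so only /tmp/small.log survives the filter here.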

 

Could anyone provide any guidance or assistance on this, or tell me if I am doing something wrong?

 

Thanks

Amn

2 REPLIES

Cloudera Employee

Hi @Amn_468, I see you have received an alert for block count on one of the DataNodes because it has crossed the threshold of 10,000 blocks. This alert is controlled by the property "DataNode Block Count Thresholds", which warns you when any DataNode crosses the specified number of blocks. You can then take the necessary action: either reduce the number of blocks by deleting unwanted files, or increase the threshold, since the amount of data naturally grows as the cluster grows over time. If you find that all the blocks are legitimate and need to be kept in the cluster, you can simply increase the threshold; this does not require a service restart.
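As a quick way to see which DataNodes are over the threshold, the per-node block counts from `hdfs dfsadmin -report` can be filtered with awk. A minimal sketch, run here against a hypothetical two-node excerpt of that report (hostnames and counts are made up, and the exact field labels can vary by Hadoop version):

```shell
# Hypothetical excerpt of per-DataNode sections from `hdfs dfsadmin -report`.
report='Name: 10.0.0.1:9866 (dn1.example.com)
Num of Blocks: 12543
Name: 10.0.0.2:9866 (dn2.example.com)
Num of Blocks: 8970'

# Remember each node as we pass its "Name:" line, then flag any node
# whose block count exceeds the 10000 threshold.
printf '%s\n' "$report" | awk '
  /^Name:/          { host = $2 }
  /^Num of Blocks:/ { if ($4 > 10000) print host, $4 }'
```

On a live cluster you would pipe the real `hdfs dfsadmin -report` output into the same awk filter instead of the sample text.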

 

Let me know if you have any further queries or comments.
