
Datanodes report block count more than threshold on datanode and Namenode

Expert Contributor

Hi,

 

I have a 3-node cluster running on CentOS 6.7. For about a week I have been seeing a warning on all 3 nodes that the block count is more than the threshold. My NameNode is also used as a DataNode.

 

It's more or less the same on all 3 nodes.

 

Concerning: The DataNode has 1,823,093 blocks. Warning threshold: 500,000 block(s).

 

I know this points to the growing small-files problem. I have unstructured website data on HDFS: jpg, mpeg, css, js, xml, and html files.

 

I don't know how to deal with this problem. Please help.

 

The output of the following command on the NameNode is:

 

[hdfs@XXXXNode01 ~]$ hadoop fs -ls -R / |wc -l
3925529
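To narrow down where the files accumulate before picking a fix, a quick sketch like the following can rank top-level HDFS directories by file count. This assumes the `hdfs` client is on the PATH; the paths are just the cluster root.

```shell
# 'hdfs dfs -count' prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
# Sort numerically on FILE_COUNT (column 2), largest first, top 10.
hdfs dfs -count /* | sort -k2,2 -n -r | head -10
```

The `sort -k2,2 -n -r` stage is what surfaces the heaviest directories first, so the cleanup effort can start where it pays off most.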

 

Thanks,

Shilpa

 

 

14 REPLIES

Expert Contributor
Ok. Thanks.

Expert Contributor

What I did so far:

 

1. Increased the NameNode heap memory

2. Increased the disk capacity of the overall cluster

3. Increased dfs.blocksize from 64 MB to 128 MB

4. Increased the block count threshold.
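Raising limits helps with the alerts, but the block count itself only drops if the small files are consolidated. One common approach is to pack static assets into a Hadoop archive (HAR), so the NameNode tracks a few large part files instead of millions of small ones. The paths below are hypothetical, just to illustrate the shape of the command:

```shell
src=/data/website     # hypothetical source directory full of small files
dest=/data/archived   # hypothetical destination directory for the archive
name=website.har

# Pack the source tree into one HAR; runs as a MapReduce job.
hadoop archive -archiveName "$name" -p "$src" "$dest"

# The original files remain readable through the har:// scheme:
hadoop fs -ls "har://$dest/$name"
```

Note that archiving does not delete the originals; once the HAR is verified, the source directory has to be removed separately for the block count to actually go down.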

Champion

If you have Cloudera Manager, you can easily find which job is putting a lot of stress on the storage. Please take a look at the links below.

 

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_disk_usage_reports.html

 

https://www.cloudera.com/documentation/enterprise/latest/topics/admin_directory_usage.html#concept_l...


Hi All,

 

I recommend checking which application team is causing it by running: hdfs dfs -count -v -h /project/*
If the FILE_COUNT is more than 10M, it is a problem for a mid-size cluster.
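To flag only the offending directories automatically, the same count output can be filtered with awk. This sketch drops the `-v`/`-h` flags so there is no header row and the file counts stay as plain numbers; the 10M cutoff mirrors the threshold mentioned above:

```shell
# Without -v there is no header; columns are:
# DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
# Print the path and file count for anything over 10 million files.
hdfs dfs -count /project/* | awk '$2 > 10000000 { print $4, $2 }'
```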

 

Please check the link below for ways to reduce the block count.

 

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ht_datanode.html#concept_uet_9pn_...

 

Regards,

Sandeep Kolli

 


To add:

Sizing a DataNode heap is similar to sizing the NameNode heap; the recommendation is 1 GB of heap per 1 million blocks. Since a block can be as small as 1 byte or as large as 128 MB, the heap requirement is the same regardless of block size.
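Applying that rule of thumb to the block count from the original warning (1,823,093 blocks) gives a quick back-of-the-envelope heap estimate:

```shell
# Rule of thumb: ~1 GB of heap per 1,000,000 blocks.
blocks=1823093   # block count reported in the DataNode warning above
awk -v b="$blocks" 'BEGIN { printf "recommended heap: ~%.1f GB\n", b / 1000000 }'
# prints: recommended heap: ~1.8 GB
```

So the current counts are well within a modest heap, but if the small-files growth continues unchecked the heap requirement grows linearly with the block count.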