DATA_NODE_BLOCK_COUNT threshold 200,000 block(s)?
Created on 05-08-2014 09:27 PM - edited 09-16-2022 01:58 AM
Using CM 4.8.2 & CDH 4.6.0, I got this health-concern warning on all three DataNodes:
"The health test result for DATA_NODE_BLOCK_COUNT has become concerning: The DataNode has 200,045 blocks. Warning threshold: 200,000 block(s)."
To solve this, can I just increase the limit to 300,000 block(s)?
Is there any particular reason the threshold value is 200,000 block(s)?
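For reference, a quick way to confirm the numbers behind the health test (a rough sketch, assuming you can run commands as the hdfs superuser) is the fsck summary, which prints the cluster-wide block total and average block size; with 3 DataNodes and the default replication factor of 3, each DataNode holds a replica of nearly every block, so the per-DataNode count tracks the cluster total closely:

# Print only the summary lines of the fsck report.
sudo -u hdfs hdfs fsck / 2>/dev/null | grep -E 'Total files|Total blocks|Average block size'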
Created 07-19-2014 10:53 PM
Having a larger number of blocks raises the heap requirement on the DataNodes. The threshold warning also exists to notify you about this (you may soon need to raise the DN heap size so it can continue serving blocks at the same performance).
With CM5 we have revised the number to 600k, given memory optimisation improvements for DNs in CDH 4.6+ and CDH 5.0+. Feel free to raise the threshold via the CM -> HDFS -> Configuration -> Monitoring section fields, but do look into whether your users have begun creating too many tiny files, as that can hamper their job performance with the overhead of too many blocks (and thereby too many mappers).
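A hedged sketch of the small-files check mentioned above (the /user/* paths are only an example; point it at whatever top-level directories your users write to): hdfs dfs -count prints DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME per path, so sorting on the file-count column surfaces the directories contributing most of the blocks:

# List top-level user directories by number of files, highest first.
sudo -u hdfs hdfs dfs -count /user/* | sort -k2,2 -nr | head

Raising the threshold in the Monitoring section only changes when CM warns; it does not change any HDFS behaviour, so the heap and small-file considerations still apply.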
Created 10-07-2014 10:47 PM
Thanks for your response.
I deleted about 3 TB of unneeded HDFS files yesterday (hadoop fs -rm -r), but the warning message still persists.
DATA_NODE_BLOCK_COUNT is the same as before deleting the files (the current value is 921,891 blocks).
How can I reduce the current DATA_NODE_BLOCK_COUNT?
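One general HDFS point that may or may not apply here (it depends on your trash configuration): files removed with hadoop fs -rm -r normally go to the owner's .Trash directory first, so their blocks are not freed until fs.trash.interval expires or the trash is emptied, and the DataNodes then delete the replicas asynchronously. For example:

# See how much data is still held in per-user trash directories.
hadoop fs -du -s /user/*/.Trash

# Empty the current user's trash immediately, or bypass trash on a delete
# (the path below is only a placeholder).
hadoop fs -expunge
hadoop fs -rm -r -skipTrash /path/to/obsolete/data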
Created 10-16-2014 04:24 AM
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-latest/Cloudera-Manage...
Gautam Gopalakrishnan
Created 11-18-2016 07:18 AM
Harsh,
In this thread you stated, "do look into whether your users have begun creating too many tiny files, as that can hamper their job performance with the overhead of too many blocks (and thereby too many mappers)." Too many tiny files is in the eye of the beholder if those files are what get you paid.
I'm also seeing a block issue on two of our nodes, but a rebalance to 10% has no effect. I've rebalanced to 8% and it improves, but I suspect we're running into a small-files issue.
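As an aside on the rebalance observation: the balancer's -threshold argument is a percentage of disk-space utilisation, so it evens out used capacity rather than block counts; with many small files a node can carry far more blocks than its peers while still sitting inside the space threshold. Typical invocation for reference:

# Balance on space utilisation to within 8% of the cluster average;
# this does not directly balance per-node block counts.
sudo -u hdfs hdfs balancer -threshold 8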
