block count warning still shows in cloudera manager even after deleting files from command line
Labels: Cloudera Manager
Created ‎10-17-2017 02:49 AM
Hello All,
We have a 13-node CDH 5.9.2 Hadoop cluster, and for the past week I have been seeing the concerning-level block count warning in Cloudera Manager for all 6 DataNodes.
The cluster is multi-tenant. I deleted older files from the HDFS command line, and the block count is now below the threshold value.
However, the block count warning still appears in Cloudera Manager.
Please suggest.
Thanks,
Priya
Created ‎10-17-2017 01:01 PM
Get these two values:
a. CM -> HDFS -> Configuration -> DataNode Block Count Thresholds
b. CM -> HDFS -> Web UI -> NameNode Web UI -> Datanodes menu -> the block count of your node
If b > a, you will get the block count warning.
Cloudera's advice also notes that the presence of many small files can trigger this warning.
Actions:
1. If it is not disrupting anything, you can ignore the warning, but keep an eye on the block pool usage percentage shown in 'b'.
2. You can increase the block count thresholds in 'a'.
3. You can clean up unwanted data, but if your trash folder retains deleted data (for example, for 24 hours), you will see the result only after that interval has passed.
4. Add more DataNodes and run a rebalance.
etc.
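The a-versus-b comparison above can also be scripted instead of being read off the web UI. The following is only a rough sketch: it assumes the NameNode's JMX output (the NameNodeInfo bean served under /jmx on the NameNode web port) exposes a LiveNodes entry whose per-node records carry a numBlocks field, which should be verified on your Hadoop version. The host names, block counts, and threshold in the sample are made up for illustration.

```python
import json


def datanodes_over_threshold(namenode_info, threshold):
    """Return {datanode: block_count} for every node whose count exceeds threshold.

    namenode_info is assumed to be the NameNodeInfo bean as a dict.
    The "LiveNodes" and "numBlocks" names are assumptions to verify
    against your own NameNode's JMX output.
    """
    live_nodes = namenode_info["LiveNodes"]
    # The bean may encode LiveNodes as a JSON string nested inside the JSON.
    if isinstance(live_nodes, str):
        live_nodes = json.loads(live_nodes)
    return {
        name: info["numBlocks"]
        for name, info in live_nodes.items()
        if info["numBlocks"] > threshold
    }


# Hypothetical sample payload; hosts and counts are illustrative only.
sample = {
    "LiveNodes": json.dumps(
        {
            "dn1.example.com:50010": {"numBlocks": 620000},
            "dn2.example.com:50010": {"numBlocks": 410000},
        }
    )
}

# 'a' from the post: the configured DataNode Block Count Threshold (hypothetical value).
print(datanodes_over_threshold(sample, threshold=500000))
```

Any DataNode this returns is one where b > a, so the warning is expected for it until its replica count drops or the threshold is raised.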
Created ‎10-17-2017 07:50 PM
b > a for all the DataNodes.
I have cleaned up the unwanted data, and on the command line the block count shows below the threshold value. However, this is not reflected in the Cloudera Manager console.
Please suggest.
Thanks,
Created ‎10-18-2017 07:24 AM
Created ‎10-18-2017 09:02 PM
It was b > a earlier.
I removed files from the command line, and the block count is now below the threshold value. However, this is not reflected in the Cloudera Manager console.
Please suggest.
Thanks,
Created ‎10-19-2017 10:11 AM
Does the information in these sources still indicate that each DataNode has far fewer replicas than its alert threshold?
Created ‎10-23-2017 10:25 PM
Thanks for the reply.
I am checking the block count by running hdfs fsck <path> and reading the total blocks figure in the output.
Please let me know whether this is the right way to check it.
Thanks,
Priya
Created ‎10-23-2017 10:32 PM
The block count warning stems from the fact that the DataNode is carrying too many replicas (of blocks).
The fsck total block count does not include the replica multiplier (x3, typically) and is global (across all DNs).
Have you looked at the Live DataNodes page on your NameNode UI, as indicated in the earlier post? It should show the reported replica count of every live DN in the cluster.
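To make the difference between the fsck number and the per-DataNode number concrete, here is a small arithmetic sketch. It assumes replicas are spread evenly across DataNodes, which a real cluster only approximates, and the total block count used below is a made-up figure; only the x3 multiplier and the 6 DataNodes come from this thread.

```python
def average_replicas_per_datanode(total_blocks, replication_factor, datanode_count):
    """Estimate replicas held per DataNode from an fsck-style global block count.

    Assumes an even spread of replicas, which is a simplification:
    the authoritative per-node figure is the one on the NameNode UI.
    """
    if datanode_count <= 0:
        raise ValueError("datanode_count must be positive")
    return total_blocks * replication_factor / datanode_count


# Hypothetical fsck total of 1,200,000 blocks, replication x3, 6 DataNodes.
estimate = average_replicas_per_datanode(1200000, 3, 6)
print(estimate)  # 600000.0
```

So a global fsck total can look comfortably low while each DataNode still holds enough replicas to exceed a per-node threshold.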
Created ‎10-23-2017 10:47 PM
Thanks for the quick reply.
I thought the output of the fsck command included the replica multiplier and gave the final total block count. Thanks for the clarification.
I checked the Datanodes page on the NameNode web UI, and the block count for each DataNode is above the threshold value.
Thanks,
Priya
